Fetching a Flask API from a Chrome extension
One of the most powerful things about an API is that it lets applications communicate with one another, such as a Flask app and a Chrome extension.
In this finale of our fake jobs detector trilogy, we are going to create a Chrome extension that will extract a job description from a job site (Indeed), and predict the legitimacy of the advert using our model in our previously created Flask API.
In this blog post, we will go through:
- Creating a simple Chrome extension
- How to identify text we want to extract
- Passing text to our Flask API
- Displaying the result we have retrieved from our NLP text classifier
Manifestation of a Chrome extension
First off, to create a Chrome extension, we have to create the manifest.json file. The manifest file essentially gives the browser information about the extension. This includes things like its name, its scripts, and the permissions it needs.
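A minimal manifest along these lines might look like the sketch below. It assumes Manifest V3, where the background script runs as a service worker, which may differ from the manifest version this extension was first written for. The file name content.js, the version number, and the host permission for a Flask server on http://localhost:5000 are assumptions for illustration; the name, icon.png, and background.js come from this post.

```json
{
  "manifest_version": 3,
  "name": "Fake jobs Analyser Plugin",
  "version": "1.0",
  "action": {
    "default_icon": "icon.png"
  },
  "background": {
    "service_worker": "background.js"
  },
  "content_scripts": [
    {
      "matches": ["https://*.indeed.com/*"],
      "js": ["content.js"]
    }
  ],
  "host_permissions": ["http://localhost:5000/*"]
}
```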
In the manifest file, what we’re basically doing is:
- Set the extension name to “Fake jobs Analyser Plugin”
- Display a default icon using icon.png
- Set our background script (to manage events in the background)
- Set our content script (runs in the context of the web page)
- Run the content script when the website URL matches https://*.indeed.com/*
Now that we’ve created our manifest file, we should place it in a folder together with the icon, the background script, and the content script.
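One possible layout is sketched here. The folder name and content.js are assumed names; manifest.json, background.js, and icon.png are the files mentioned in this post.

```
fake-jobs-extension/
├── manifest.json
├── background.js
├── content.js
└── icon.png
```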
Once we’ve created our files, we can navigate to <chrome://extensions/> (copy-paste it into the address bar) to load our Chrome extension. Here, we just need to enable “developer mode” and load our unpacked Chrome extension into the browser.
After the Chrome extension is loaded, you should see its icon on your Chrome extension toolbar.
Now that you can see your Chrome extension activated, you can congratulate yourself! You’ve made your first Chrome extension that does… nothing (yet). Let’s start programming our features to make it useful!
Identifying the information to extract
Before we start adding features to our Chrome extension, we first have to identify the HTML element that holds the job description on Indeed. This is so that we can extract the text and post it to our API.
For our example, we’re going to use this bioinformatician/data scientist advertisement that I found. I’ve also removed the details about the company to ensure we don’t get in trouble if anything happens.
If you want to follow along, just navigate to Indeed.com and find a job ad of your liking, right-click, and open it in a new tab. You should land on a URL similar to this: https://au.indeed.com/viewjob?jk=somethingSomethingNotARealURL.
Once you’ve navigated to the job posting of your choice, right-click on the job description and select “inspect”. This opens Chrome DevTools, which lets us see the elements of the page.
As our job descriptions lie within the “jobDescriptionText” element, we’ll extract the text from it and feed it to our prediction model.
Extracting text from the content page
To extract the job details from the website, we first have to create a content script. The content script is needed so that we can read details from the website’s DOM. We start by extracting the job details from the HTML element.
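A minimal sketch of that extraction step could look like this. It assumes “jobDescriptionText” is the id of the element (as seen in DevTools), and the function name and the optional doc parameter are illustrative choices, not part of any Chrome API.

```javascript
// content.js (hypothetical file name)
// Assumption: the job description sits in an element whose id is
// "jobDescriptionText", as seen in DevTools on the Indeed job page.
function getJobDescription(doc = document) {
  const element = doc.getElementById("jobDescriptionText");
  // Return null when the element is missing (for example, on a non-job page).
  return element ? element.innerText.trim() : null;
}
```

Passing the document in as a parameter is only there to make the function easy to try outside the browser; in the content script it can be called with no arguments.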
Next, we create a function that delegates our API call to the background script (background.js). We do this because of restrictions on cross-origin requests: to prevent sensitive information from leaking, the browser generally blocks a content script from calling an API on a different origin than the page it runs in.
For that reason, we delegate the API call to the background script, which is not subject to the same CORS restrictions. We can use chrome.runtime to send the information to the background script, which then performs the API call.
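The delegation from the content script might be sketched as follows. The endpoint URL, the { url, data } message shape, and the function name are assumptions for illustration; only chrome.runtime.sendMessage and chrome.runtime.lastError are real extension APIs here.

```javascript
// content.js (hypothetical file name)
// Assumptions: the endpoint URL and the { url, data } message shape are
// illustrative; use whatever your Flask API actually exposes.
const API_URL = "http://localhost:5000/predict";

function requestPrediction(text, runtime = chrome.runtime) {
  return new Promise((resolve, reject) => {
    // Hand the request to the background script instead of calling fetch here.
    runtime.sendMessage({ url: API_URL, data: { text } }, (response) => {
      if (runtime.lastError) {
        reject(new Error(runtime.lastError.message));
      } else {
        resolve(response);
      }
    });
  });
}
```

Wrapping the callback in a Promise lets the rest of the content script use then or await instead of nesting callbacks.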
In our background script, we’ll add a chrome.runtime listener that waits for messages from the content script. When a message arrives, it calls the API whose URL is included in the message.
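A sketch of such a listener is shown here. It assumes the message has the shape { url, data } and that the Flask API accepts a JSON POST body and replies with JSON; neither is confirmed by this post, so adjust both to your API. The handler name and the injectable fetchFn parameter are illustrative.

```javascript
// background.js
// Assumptions: the message has the shape { url, data } and the Flask API
// accepts a JSON POST body and answers with JSON. Adjust to your API.
function handleMessage(message, sender, sendResponse, fetchFn = fetch) {
  fetchFn(message.url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(message.data),
  })
    .then((response) => response.json())
    .then((result) => sendResponse(result))
    .catch((error) => sendResponse({ error: String(error) }));
  // Returning true keeps the message channel open for the async response.
  return true;
}

// Register the listener only when running inside the extension.
if (typeof chrome !== "undefined" && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener(handleMessage);
}
```

Returning true from the listener matters: without it, the message channel closes before the fetch finishes and the content script never receives the prediction.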
Coming back to our content script, we’ll add one last function for adding our “Job Details”. This adds information on the legitimacy of the job advert. It is simple, as we are just concatenating HTML elements.
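One way to sketch this step is below. Instead of concatenating HTML strings, it creates an element and sets its textContent, a swapped-in technique that avoids interpreting the API response as HTML. The function name, the wording of the banner, and the assumption that the prediction is a short label such as “real” or “fake” are all illustrative.

```javascript
// content.js (hypothetical file name)
// Assumption: `prediction` is a short label such as "real" or "fake";
// the actual response format of the Flask API may differ.
function showPrediction(prediction, doc = document) {
  const container = doc.getElementById("jobDescriptionText");
  if (!container) return null;

  const banner = doc.createElement("h2");
  // textContent avoids interpreting the API response as HTML.
  banner.textContent = `This job advert is predicted to be: ${prediction}`;
  banner.style.color = prediction === "fake" ? "red" : "green";

  // Place the banner at the top of the job description.
  container.prepend(banner);
  return banner;
}
```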
Finally, we also want to ensure we only run our script if we’re on an appropriate URL. So we’ll use a simple if statement to check whether our URL contains “indeed.com/viewjob”.
We’ll then call the functions we previously built: get the job details, make a POST request to our Flask API, and then append the results into the job advert.
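The overall flow could be sketched like this. The three steps are passed in as functions so the block stands on its own; the names getDescription, predict, and show are placeholders for whatever functions your content script defines.

```javascript
// content.js (hypothetical file name)
function isJobAdvertPage(url) {
  return url.includes("indeed.com/viewjob");
}

// The three steps are passed in as functions so the flow is easy to follow;
// their names are illustrative placeholders.
async function analyseJobAdvert(url, { getDescription, predict, show }) {
  if (!isJobAdvertPage(url)) return false;
  const text = getDescription();
  if (!text) return false;
  const result = await predict(text);
  show(result);
  return true;
}
```

In the content script, this would be called with window.location.href as the URL, so nothing is sent to the API on pages that are not job adverts.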
Taking a look at our Chrome extension results
Now that we’ve completed our script, reload the extension in <chrome://extensions/> and refresh the job advert page. Depending on how fast your model makes predictions, within a few seconds we should see a large, glaring statement saying whether the job is real or not.
And just to show that our model is not returning one constant prediction, I found an example that it predicted to be a fake job advert. To clarify, our model predicting this job advert as fake DOES NOT mean that the job advert is actually fake; this could simply be a Type I error (a false positive).
Besides, this job advert could also have been classified as fake due to how poorly written it is. There are minimal details about what the job entails and it’s just filled with “Special Conditions” required for the applicant.
In conclusion, creating a simple Chrome extension is fairly easy, with the only difficulty being passing text from the content script to the background script. It was a small hassle but a hassle nonetheless.
Now that we’ve completed our Chrome extension, we can continue this piece of work by optimising our Naive Bayes text classification model, extending the extension to other job sites, and possibly publishing it to the Chrome store.
While doing the above is no easy feat, I’m sure we can just Google/StackOverflow it, just like a wise programmer once said:
My boss tells me that my skills are very valuable to the team. I am literally googling all day long everyday. If I wasn’t, I’d get fired. – lezorte, reddit