Rewind
In the previous post, I explained the process of web scraping job descriptions from a certain website. After collecting and cleaning the data, some NLP models were developed in order to detect skills, knowledge, minimum experience and levels (degree) in job descriptions.
The models
Manual annotation model
After some research, my group and I decided to go with an NER model from the spaCy library. To feed the model we had to annotate the data by hand, using a free annotation tool called doccano. We manually labelled around 200 job descriptions.
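To give an idea of what happens between the annotation tool and spaCy, here is a minimal sketch of turning an annotation export into the `(text, {"entities": [...]})` tuples commonly used to build spaCy training data. It is not the exact code from the project. It assumes a JSONL export where each record has a `text` field and a `label` list of `[start, end, tag]` spans; the function name `doccano_to_spacy` and the tag names `EXPERIENCE` and `SKILL` are made up for the example.

```python
import json


def doccano_to_spacy(jsonl_lines):
    """Convert JSONL annotation records into (text, annotations) tuples."""
    examples = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines in the export
        record = json.loads(line)
        # Assumption: spans are stored under "label" (some exports use "labels").
        spans = record.get("label") or record.get("labels") or []
        entities = [(int(start), int(end), str(tag)) for start, end, tag in spans]
        examples.append((record["text"], {"entities": entities}))
    return examples


# One hypothetical annotated job description line.
sample = [
    '{"text": "3 years of experience with Python", '
    '"label": [[0, 7, "EXPERIENCE"], [27, 33, "SKILL"]]}',
]
training_data = doccano_to_spacy(sample)

for text, annotations in training_data:
    for start, end, tag in annotations["entities"]:
        print(tag, "->", text[start:end])
```

Each span is a pair of character offsets, so slicing the text with them is a quick way to check that the annotations line up with the words you meant to label.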
Automated labelling model
This model can't differentiate between the labels (skills, knowledge, minimum experience and levels), because it only takes the skills from a dictionary and matches them against the job descriptions.
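The dictionary matching step could look roughly like the sketch below. It is only an illustration under my own assumptions: the function name `auto_label`, the single `SKILL` tag and the tiny skills list are hypothetical, and the matching is plain case-insensitive whole-word search.

```python
import re


def auto_label(text, skills, label="SKILL"):
    """Return (start, end, label) spans for every dictionary term found in text."""
    spans = []
    # Longest terms first, so "machine learning" wins over a shorter overlapping term.
    for term in sorted(skills, key=len, reverse=True):
        # Match the term only when it is not glued to other word characters.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            start, end = match.span()
            # Keep the match only if it does not overlap a span we already kept.
            if all(end <= s or start >= e for s, e, _ in spans):
                spans.append((start, end, label))
    return sorted(spans)


skills = ["python", "machine learning", "sql"]  # hypothetical dictionary
text = "We need Python, SQL and machine learning skills."
spans = auto_label(text, skills)

for start, end, tag in spans:
    print(tag, "->", text[start:end])
```

Because every match gets the same tag, the output shows why this approach can't separate skills from knowledge, experience or levels: the dictionary only knows one kind of entity.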
Entity Ruler
Very similar to the automated labelling model. The only difference is that nothing is trained: the Entity Ruler simply looks for the dictionary words in the job descriptions, without taking into account the position of a word in the sentence.
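For reference, a minimal Entity Ruler setup could look like this. It is a sketch assuming spaCy v3 and a blank English pipeline (so no trained model is downloaded); the two patterns and the `SKILL` tag are examples I made up, not the dictionary used in the project.

```python
import spacy

# A blank pipeline only tokenizes; nothing here is trained.
nlp = spacy.blank("en")

# The entity ruler marks entities purely from token patterns.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "SKILL", "pattern": [{"LOWER": "python"}]},
    {"label": "SKILL", "pattern": [{"LOWER": "machine"}, {"LOWER": "learning"}]},
])

doc = nlp("We are looking for Python and machine learning experience.")
found = [(ent.text, ent.label_) for ent in doc.ents]
print(found)
```

The `LOWER` attribute makes the patterns case-insensitive, which matters because job descriptions are not consistent about capitalisation.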
Examples
Let's focus on the manual annotation model, since it is the one that allows us to see the different labels.
Blockchain Developer
Web Developer
Data Scientist
Final Thoughts
More annotated data would definitely improve the model: we can clearly see that some entities are still missing in the examples above.
These models can be applied to several use cases, such as helping HR find the right candidate or helping job seekers find their perfect match. It's up to your imagination!
You can find the code on my GitHub: