My First Python Application – A Word Cloud Based Resume Optimizer

A real-world application of Python for data visualization, which I learnt in the recently completed Artificial Intelligence Foundation course.

It has been a couple of months since I started the Foundations of Artificial Intelligence Program from SkillUp Online. After putting some dedicated effort into my studies, I successfully completed the well-crafted course last Wednesday. So I thought of applying what I learnt to a real-world scenario by developing a Python application.

Here is my first application of the Python programming that I learnt through the course. Well, it is not a CAD/Design AI application as I mentioned in my earlier post. Rather, I am starting with a simple yet really useful application for now. As an aspiring job seeker, I thought of analyzing my resume for relevant keywords by visualizing how prominently they appear in it. Moreover, it is a well-known fact that Applicant Tracking Systems (ATS) screen resumes based on keywords and filter them accordingly. So I wanted to analyze the current state of my resume and optimize it accordingly.

Here is the result I got from the first run of the Python program on my resume:

Word Cloud Before Resume Optimization

A quick analysis of the word cloud revealed that my resume does not properly represent my key skills. I wanted key strengths such as Information Management, Automation, Innovation and VBA to stand out more. Other skills such as CAD, Data Control, Design and Engineering were already well represented. After several iterations of the resume, I was able to come up with a balanced one. The word cloud generated by the same application after the fine-tuning is shown below.

Word Cloud After Resume Optimization

And here is the quick-and-dirty Python application that I wrote to accomplish this. Since this is my first program, I haven’t cared much about following best programming practices. I hope to improve on that in future though. Nevertheless, it was really amazing to see what such a small Python program could achieve! Here is the link to the GitHub Gist page if you want to play with the code.

"""
Word Cloud Resume Optimizer
By Mohamed Haris
GitHub ID: @harismohamed

"""
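#install required libraries (the "!" prefix works only in Jupyter/IPython;
#from a regular shell, run "pip install python-docx wordcloud" instead)
!pip install python-docx 
!pip install wordcloud 
#import required libraries
from docx import Document
from wordcloud import WordCloud, STOPWORDS
import matplotlib.pyplot as plt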

#set path for resume file
myresume="C:\\temp\\my_resume.docx"

#read the document
doc = Document(myresume)

#collect text from each paragraph and join it once
#(note: doc.paragraphs does not include text inside tables)
resumetext = '\n'.join(para.text for para in doc.paragraphs)

#start from the default stop words shipped with wordcloud
stopwords = set(STOPWORDS)

#add custom stop words for better results
#custom stopwords are stored in a text file, one word per line
custstopwordsfile="C:\\temp\\stopwords.txt"
#the with-block closes the file even if reading fails
with open(custstopwordsfile,'r') as f:
    custstopwords=f.read().splitlines()

#add custom stop words to the default ones
stopwords.update(custstopwords)

#instantiate a word cloud object
resume_wc = WordCloud(
    background_color='black',
    stopwords=stopwords)

#adjust the image rendering area
fig = plt.figure(figsize=(12, 16)) # width, height in inches

#generate the word cloud
resume_wc.generate(resumetext)
plt.imshow(resume_wc, interpolation='bilinear')
plt.axis('off')
plt.show()
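
The word cloud gives a visual impression of keyword prominence; to put rough numbers on it between resume iterations, the target keywords can also be counted directly. The following is only a minimal sketch and not part of the original program: the function name `keyword_counts` and the sample text are made up for illustration, and in practice the `resumetext` string built above would be passed in.

```python
import re


def keyword_counts(text, keywords):
    """Count case-insensitive whole-word occurrences of each keyword or phrase."""
    lowered = text.lower()
    counts = {}
    for kw in keywords:
        # \b anchors the match on word boundaries, so "VBA" does not match
        # inside a longer word. This assumes each keyword starts and ends
        # with a letter or digit.
        pattern = r"\b" + re.escape(kw.lower()) + r"\b"
        counts[kw] = len(re.findall(pattern, lowered))
    return counts


# Hypothetical sample text standing in for the real resume content.
sample = ("Automation of CAD design. "
          "Information management and automation with VBA.")
targets = ["Automation", "VBA", "Information Management", "Innovation"]

for keyword, count in keyword_counts(sample, targets).items():
    print(keyword, count)
```

A keyword that shows a count of zero, or a much lower count than less important words, is a candidate for the next round of resume edits.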

It would be fun to host this program online so that readers can analyze their own resumes. But right now I have no idea how to go about it. I shall post an update once it is available online for public testing. Till then, I advise you to run the code on your own system to test the output.

Application of Artificial Intelligence for Early Detection of Pandemic Outbreak

Some random thoughts on how Artificial Intelligence could be used for early identification of pandemic outbreaks

Today I stumbled upon an interesting BBC article titled ‘Treating cancer, stopping violence… How AI protects us’, and it sparked a few thoughts on how AI could help us with early prediction of pandemics and epidemics in the coming days. A Google search turned up a few initiatives in this regard. I would like to discuss these initiatives first, then share my personal views on further opportunities in this direction.

The BBC article mentions two interesting systems in use for early prediction of infectious diseases such as dengue fever, yellow fever, Zika and chikungunya. The first one, named Artificial Intelligence in Medical Epidemiology (AIME), uses case reports pulled in from local hospitals, combined with weather and social factors, to predict an outbreak well in advance. Another project, Microsoft Premonition, employs scalable monitoring of the environment to detect disease threats early, using robotics and genomics. Its cloud-scale genomic analysis tries to identify all the species of organisms and viruses in environmental samples to spot new transmission patterns.

More recently, an article titled ‘AI could help with the next pandemic—but not with this one’ in MIT Technology Review explains how companies like BlueDot and Metabiota used a range of natural language processing (NLP) algorithms to monitor news outlets and official health care reports in different languages around the world to provide early indications of the COVID-19 outbreak. They also applied air travel data analysis to predict the pace of spread, with reasonably accurate results.

One problem with the big data analysis methods described above is that not all countries allow transparent sharing of information on news channels or social media platforms. In such cases, depending heavily on news and social media analysis for pandemic prediction may produce inconsistent results, so we might need additional techniques to arrive at a more reliable picture. For example, we could use computer vision along with thermal imaging sensors in crowded places such as bus stations, railway stations and airports to identify abnormal patterns. If we can apply AI techniques such as Natural Language Processing (NLP), speech processing and computer vision across news channels, social media, video/voice communications and crowded places, we would be in a very good position to pick up hints of a pandemic at a very early stage. Needless to say, only early identification followed by rapid preventive measures can contain the disease within a small geographical area.
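
To make "identifying abnormal patterns" a little more concrete, here is a toy sketch of one very simple approach: flag a day whose count sits far above the recent average. Everything in it is an assumption for illustration only. The daily counts are invented (they could stand for, say, the number of elevated-temperature readings at one station), the function name `flag_anomalies` is hypothetical, and a real surveillance system would need far more careful statistics than a trailing z-score.

```python
from statistics import mean, stdev


def flag_anomalies(daily_counts, window=7, threshold=3.0, min_sigma=1.0):
    """Return indices of days whose count is more than `threshold`
    standard deviations above the mean of the preceding `window` days."""
    flagged = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu = mean(history)
        # Floor the spread so a perfectly flat history cannot cause a
        # division by zero or make tiny changes look extreme.
        sigma = max(stdev(history), min_sigma)
        if (daily_counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Invented example data: a quiet period followed by one sharp spike.
counts = [4, 5, 3, 6, 4, 5, 4, 5, 21, 6]
print(flag_anomalies(counts))
```

With this invented series, only the day with the spike (index 8) is flagged; the days before it have too little history or stay close to the trailing average.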

Of course, data privacy is a major concern here, but it is not limited to this case. Most AI applications share the same concern because of the sheer volume of data they use. It needs to be mitigated at appropriate levels to ensure the survival of human beings. At some point, we need to stop thinking in absolute terms and start weighing things comparatively, so that rules and regulations do not compromise our existence in this world.