Ruby script: Retrieve multiple LinkedIn profiles


Over the weekend, a friend and I were discussing learning how to program, and I mentioned that I had tried a few JavaScript modules on Codecademy. A few days ago, I received an email from Codecademy announcing that they were starting Ruby classes. I’ve been playing around with Ruby on Rails but never really took the time to learn core Ruby. I had enjoyed other classes at Codecademy, so I decided to try the Ruby track. I got so engrossed in it that I ended up finishing the whole course in about 48 hours, earning a two-day points streak in the process.

Come Monday morning and a slow day at work, I was faced with a challenge that I thought some Ruby code might help with. In a few weeks, I’m helping to organize a networking coffee for a few professionals in DC, and to make things more interesting, we are going to ask registrants to provide a link to their LinkedIn profile. The day before the event, I plan to distribute information from their LinkedIn profiles in a PDF document so that attendees can be “briefed” prior to the event. Hopefully they will find people with whom it makes sense to network (for business or personal reasons), so that they don’t have the same conversations that are common at these events, which usually start with “What is your name? What do you do?”


I realized that by promising to provide this, I would have to manually visit each of these LinkedIn profiles, pull down the fields, type them into a Word document and then generate a PDF. This being 2012 and LinkedIn being an API-driven website, I thought this could be automated. I was itching to write the script in Ruby, feeling confident that my Codecademy training gave me enough background to give this a shot. I began by going to the LinkedIn API website, where luckily enough I found some documented Ruby code that showed me how to connect to the API. However, after registering my “app” with LinkedIn, I found that I was struggling to use the native Ruby code. This being Ruby, I naturally assumed that there was a gem which encapsulated all the LinkedIn API functionality. Sure enough, Wynn Netherland on GitHub came to the rescue with his aptly named linkedin gem. I installed it on my laptop and thought I was off to coding bliss.

Unfortunately, a combination of Ruby newness and a less than complete understanding of the LinkedIn API meant that I struggled for a few hours and googled a lot to figure out how to incorporate the one-time manual authorization required by the LinkedIn API into my script. I definitely breathed a sigh of relief when I was able to download the first fields from my own LinkedIn profile and realized that my Monday morning dream might actually be possible.
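One way to keep that manual step to a single occurrence is to save the credentials it produces and reuse them on later runs. The following is only a minimal sketch under my own assumptions: `load_or_authorize`, the YAML file, and the `"token"`/`"secret"` keys are hypothetical names, and the block stands in for whatever authorization call your LinkedIn library actually provides. It is not the code from my script.

```ruby
require "yaml"

# Hypothetical helper: returns cached credentials from `path` if present,
# otherwise runs the block (the one-time manual authorization) and saves
# the token/secret pair the block returns.
def load_or_authorize(path)
  if File.exist?(path)
    cached = YAML.safe_load(File.read(path))
    return cached if cached.is_a?(Hash) && cached["token"] && cached["secret"]
  end

  token, secret = yield # the manual step happens only on this path
  credentials = { "token" => token, "secret" => secret }
  File.write(path, credentials.to_yaml)
  credentials
end
```

On the first run the block is executed and its result is written to disk; on every later run the file is read and the block is never called. Keep that file out of version control, since it holds access credentials.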

As I worked through my thoughts on this project, I went through several versions of the underlying code. I initially toyed with the idea of saving all of the LinkedIn profile URLs to a separate file and then reading this file into the script, but the more I thought about it, the more I wanted a direct connection to the registrants on my Eventbrite page. I continued to dig around until I found a Ruby API for Eventbrite that would let me pull fields from my RSVP list. Since I had set up my Eventbrite registration to ask registrants to enter their LinkedIn profile information in the Bio field, I was able to pull out the LinkedIn public profile URL for those who provided one, and for those who provided a bio instead, I could retrieve the text and output it directly. Integrating the API made the whole process more seamless, because every time someone new registered for the event, his or her details automatically flowed into my script and showed up at the other end in the automagically generated HTML file.
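The Bio field handling can be sketched roughly as follows. This is an illustration rather than the code in my script: `split_bio` is a hypothetical helper, and the URL pattern assumes public profile links of the form `linkedin.com/in/...` or `linkedin.com/pub/...`, which may not match every link a registrant pastes in.

```ruby
# Assumed shapes for a public profile URL (an assumption for this sketch).
LINKEDIN_URL = %r{https?://(?:www\.)?linkedin\.com/(?:in|pub)/[^\s<>"']+}i

# Hypothetical helper: returns the profile URL if the Bio field contains
# one, otherwise the free-text bio (or nil when the field is empty).
def split_bio(bio)
  text = bio.to_s.strip
  url = text[LINKEDIN_URL]
  if url
    # Drop punctuation that often trails a URL at the end of a sentence.
    { url: url.sub(/[.,;)]+\z/, ""), bio: nil }
  else
    { url: nil, bio: (text.empty? ? nil : text) }
  end
end
```

Registrants whose result has a `:url` can then be looked up through the LinkedIn API, while those with only a `:bio` have that text printed as-is.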

With the I/O taken care of, the remaining items were creating a formatted file and turning that file into a PDF. I began researching ways to print out the fields wrapped in HTML tags so that I could create an HTML web page with the information I wanted. I figured that if worst came to worst, I would be able to print the HTML page to a PDF. Since I was going to do the whole project as a single Ruby script, I wanted a basic gem that would allow me to write HTML (or a Markdown equivalent) that could be converted to valid HTML and stored in a separate file. Another Ruby gem called Markaby turned out to be just what I was looking for. Once the HTML was rendering correctly, I also decided to style it with a CSS file generated by the Zurb Foundation framework. Pretty quickly I saw the simple field listing transformed into a professional document that I could share with the attendees at my networking event. However, even though I was generating beautiful web pages, I realized that I would not be able to store this HTML output on the organization’s server. I also thought it would be good to have something that could be easily printed out. I began to hunt for a way to convert HTML files to PDF.
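To give a feel for the HTML generation step without requiring any gems, here is a small sketch that uses ERB from Ruby's standard library in place of Markaby. The template, the `render_briefing` helper, and the `:name` and `:headline` keys are all made up for illustration; my script uses different fields and markup.

```ruby
require "erb"

# Illustrative template; every value is HTML-escaped before output.
BRIEFING_TEMPLATE = ERB.new(<<~HTML, trim_mode: "-")
  <html>
  <head><title><%= ERB::Util.html_escape(title) %></title></head>
  <body>
  <h1><%= ERB::Util.html_escape(title) %></h1>
  <%- attendees.each do |a| -%>
  <div class="attendee">
    <h2><%= ERB::Util.html_escape(a[:name]) %></h2>
    <p><%= ERB::Util.html_escape(a[:headline]) %></p>
  </div>
  <%- end -%>
  </body>
  </html>
HTML

# Hypothetical helper: renders one HTML page for a list of attendee hashes.
def render_briefing(title, attendees)
  BRIEFING_TEMPLATE.result_with_hash(title: title, attendees: attendees)
end
```

The returned string can be saved with `File.write` and opened in a browser. Escaping matters here because the names and bios come straight from what registrants typed into a form.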

Through Googling, I identified several possible gems to use, including Prawn, PDFKit and wicked_pdf. Prawn and wicked_pdf are great for Rails projects, but I found it difficult to wrap my head around integrating them into a single-file Ruby script. PDFKit worked reasonably well, though the formatting wasn’t as clean as I would have liked. For the project I was working on, I decided to automatically generate the HTML from the script and then manually print the PDF. In time, I’m sure easier to use and more flexible gems will be developed.

Using the Scripts

I’ve released the code as open source on GitHub under the project name eb_linkedin and included basic installation instructions. The code is definitely tailored to my use case, pulling in the fields that I needed for my project in the order they appeared in my Eventbrite registration fieldset. That said, if it’s useful, you can easily tailor it to your needs.

Outstanding Issues

Thus far, the script works reasonably well and beats having to manually look up information from multiple LinkedIn profiles. A few caveats, however:

  • The script is very picky about the format of the URLs. These have to be LinkedIn public profile URLs; just looking up a LinkedIn profile using Google or LinkedIn and copying the URL out of the address bar won’t work. If there is no public URL, the script won’t work for that particular LinkedIn profile.
  • All the data that you are pulling must reside in Eventbrite. If something is missing, the script will hang, so you’ll find yourself editing the entries of registrants and rerunning the script when errors occur.
  • There are limits to the number of calls you can make using the LinkedIn API. I believe the number is 1,000,000, but if you run a lot of names multiple times you’ll hit the limit and have to re-register.

Most of these issues are caused by limitations in the LinkedIn API which I hope will be fixed in future versions.
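In the meantime, the missing-data problem can be softened by checking each registrant before doing any lookups and setting aside the incomplete ones instead of stopping. This is a sketch of that idea only: the field names in `REQUIRED_FIELDS` and the `partition_registrants` helper are hypothetical and are not what eb_linkedin currently does.

```ruby
# Hypothetical field names; adjust to match your own registration form.
REQUIRED_FIELDS = %w[first_name last_name bio].freeze

# Splits registrants (hashes keyed by field name) into two arrays:
# those with every required field filled in, and those missing something.
def partition_registrants(registrants)
  registrants.partition do |registrant|
    REQUIRED_FIELDS.all? { |field| !registrant[field].to_s.strip.empty? }
  end
end
```

Calling `complete, incomplete = partition_registrants(list)` lets the script process the complete entries right away and print the incomplete ones as a to-do list of Eventbrite entries to fix.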

Best of luck, and let me know if there are issues in the comments below or on GitHub. Thanks for checking it out!



Wednesday, October 24, 2012




Image credit: nan palmero via photopin cc