I recently wrote a post on hacking together a linkbuilding tool where I set myself a challenge of learning a bunch of new technologies in 2 hours in order to be able to build a basic linkbuilding tool. I learnt just enough YQL, xpath, Python and Google App Engine to do the job. Since then I've put this to use in at least one tool that's actually helping me and my team do our jobs better.
Inspired by this (and encouraged by Kate Morris, a recent addition to the Distilled team), I started putting together a cheatsheet of the basic YQL and xpath I had learnt. In the end, it turned into that plus a collection of APIs and datasets that could make great starting points for tools (either for research or for creating linkworthy content):
Download it: API and data cheatsheet
Or link to it: API and datasource cheatsheet [PDF]:
<a href="https://moz.com/blog/api-and-dataset-cheatsheet-building-quick-dirty-tools">API and datasource cheatsheet</a> [<a href="https://www.distilled.co.uk/blog/wp-content/uploads/2010/05/api-data-cheatsheet.pdf">PDF</a>]
Or tweet it!
I wanted to create the kind of thing that I'd find useful to have around for inspiration and quick memory-jogs. So I focused on three areas:
Sources
APIs
I have been enjoying digging through Programmable Web to find great APIs that do cool things. The two I'm currently most excited about are:
- Face.com - just for pure awesomeness. I haven't actually tried it yet, but a face recognition API? Are you kidding me?
- Alchemy - for the time-saving ability of extracting visible text from a page. This is the kind of thing I don't want to have to code myself for sure.
Data sources
In addition to tools that do cool things, sometimes you need input data. Some of the APIs are designed to give you data, others manipulate data, but sometimes you just need that raw data. In addition to having one of the coolest names around (maybe I'm just a sucker for chimps), infochimps, which catalogues data sets around the web, is perhaps also one of the coolest sites on the web. It has everything from the 1,000 most frequently used English words to Trst Rank for Twitter users [data] (check out their big datasets if you really want to get your hadoop on).
Magic
As I discussed in my last post, I'm not a developer. My code is testament to that. I therefore love stuff that makes my life easier. Re-using work that other smart people did was cheating at school, but is a hugely valuable life skill when you are actually trying to get real stuff done. There are a small number of bits of syntax for YQL and xpath that I keep needing to look up, so I included them in the cheatsheet.
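As a memory-jog for the xpath side, here is a minimal sketch in Python (one of the languages mentioned above). It is not taken from the cheatsheet: the sample markup, the `extract_links` helper and the `links` id are all made up for illustration, and it uses the standard library's `xml.etree.ElementTree`, which only understands a subset of xpath and needs well-formed markup (real pages often aren't, so treat this as a toy).

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet standing in for a fetched page.
SAMPLE = """
<html>
  <body>
    <div id="links">
      <a href="https://example.com/one">One</a>
      <a href="https://example.com/two">Two</a>
    </div>
    <div id="footer">
      <a href="https://example.com/about">About</a>
    </div>
  </body>
</html>
"""


def extract_links(markup, container_id):
    """Return (href, text) pairs for every <a> inside the div with the given id."""
    root = ET.fromstring(markup)
    # ElementTree supports a limited xpath: .//tag, [@attr='value'], and so on.
    path = ".//div[@id='{}']//a".format(container_id)
    return [(a.get("href"), a.text) for a in root.findall(path)]


if __name__ == "__main__":
    for href, text in extract_links(SAMPLE, "links"):
        print(href, text)
```

The expression `.//div[@id='links']//a` reads as "find any div whose id is links, then any anchor beneath it", which is the sort of pattern I keep having to look up.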
Horsepower
You could do all this stuff yourself. Or you could get a computer to do it. The final column outlines the tools I have used for different kinds of tasks:
- Mozenda: best for one-off site scraping and rapid proof-of-concept
- 80legs: best for rapid development of well-defined tasks
- Google App Engine: best for combinations of ease-of-use and flexibility. Great for accessing APIs. Better for beginners than:
- Amazon Web Services: best for experts and production code
Sometimes things just have to be done by humans, but that doesn't mean it necessarily has to be you doing it. I have included some links to my favourites, but Rand's post on outsourceable SEO tasks is the place to start reading for an introduction.
Inspiration
One of the sources of inspiration for this post has been reading on DataWrangling about the work of Peter Skomoroch who is a research scientist at LinkedIn (and whose delicious links are included in the cheatsheet). I love this presentation on the creation of TrendingTopics.org:
At some point, I will loop back around and update this with more API links etc. In the meantime, another API I've come across is the Wordstream API, which gives a load more keyword juiciness to your API fun.
If you liked this, I'd love a tweet or a link: API and datasource cheatsheet [PDF]:
<a href="https://moz.com/blog/api-and-dataset-cheatsheet-building-quick-dirty-tools">API and datasource cheatsheet</a> [<a href="https://www.distilled.net/wp-content/uploads/2010/05/api-data-cheatsheet.pdf">PDF</a>]
Thanks for including us in your article, Will. Glad you like the name =)
We're launching the Infochimps API (https://api.infochimps.com) very soon here (less than 3 days), so soon you'll have access to data via API instead of just raw in a CSV.
We'll have the trstrank data available first and soon to follow with other metrics such as the words users use the most as well as metrics for the number of @ replies in/out and retweets in/out for a Twitter user. Soon to follow after that with a lot of other cool data like US Census Demographics.
- Jesse @ infochimps
*brain explodes*
I'm not sure what it is about web development that scares the hell out of me but I panic!
Perfect post for me then methinks!
Thanks a lot.
404 page on the download link. Is there an alternative to download?
Thanks,
Rodrigo
It's not really an API, but for crawl related tools and data, Nutch or Nutch WAX are open source tools you might find useful.
Would like to see the Compete and Alexa APIs included in the next edition. Competitive intelligence is crucial. Other important candidates include the Google AJAX Search, Bing, and Yahoo Search APIs.
Great post Will!
Just so happens I've spent the whole weekend researching APIs for use with online SEO tools and I've got a long list to get started on now!
One stumbling block I hit was with search engine APIs. Looks like the Terms of Use for all the APIs forbid you from using them in online tools.
What I'd like to do is return the top 5 results for a search query as the basis of the tool - but all the search engines (understandably) consider this scraping and block you :-(
Anyway, thanks for the cheatsheet - hopefully soon I'll be able to write my own post about the work I'm doing with the APIs!
Thanks,
Nick
One of the best posts I've seen about this topic. I can't find much info about working with APIs. I've just written a YouMoz post (hoping it gets published) with a quick and dirty app I made using the SEOmoz API. I especially liked the Alchemy API - haven't seen it before and it looks very interesting. What do I do with all these tools out there? So many ways to spend my time... (and sometimes money)...
Cool stuff, though it's been a while since I myself wrote any serious code. I have forwarded the link to this post to my programmers though. :)
The same here... sent to my fellow devs, who asked me to thank you expressly, Will: and here I am.
As for me, I promised myself to save the links in a special "Want to be a dev" favourites folder.
Same for me, Thanks Will. The programmers get giddy when they get new APIs to play with!
I've been crying out for ages for someone to put together a "quick n dirty" cheat sheet for 80legs...
I'm using it, but nowhere near to the extent that I should be. C'mon seomoz, get some posts out on this excellent tool for the good of the community.
Oh - and see you all at SMX Advanced tomorrow (got meetings today so won't make it)..
I agree MOGmartin,
80legs looks like a great tool. I have used mozenda quite a few times to get the idea together, but I have looked at 80legs thinking "oh, another cool tool". I found it via "the facebook whisperer", but I have found mozenda to be easier for a point and click guy like myself.
I should also say, "Will Critchlow, you may be my next hero". Great ideas. Will you be at Google I/O tomorrow?