Most Twitter users are not aware of the amount of data that Twitter collects in each tweet. Many probably don’t care. Even social media marketers are not all aware of the “meta-data” that Twitter makes freely available via its APIs to anyone who registers as a developer. Despite its 140-character limit, each tweet carries roughly ten times more data than meets the eye.
Source: Twitter Meta Data and Social Networks Visualisation, Wasim Ahmed (2015)
The anatomy of a tweet
This colourful, if slightly unreadable, version was posted by Twitter employee Raffi Krikorian in the Wall Street Journal in 2010.
Overwhelming? I think so too. Just think, every single tweet carries this amount of meta data.
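To make this concrete, here is a minimal Python sketch that separates the visible text of a tweet from everything else that travels with it. The sample tweet is hand-written for illustration and heavily trimmed. The field names follow my understanding of the version 1.1 tweet object, so treat them as assumptions and check them against the current API documentation. A real tweet carries many more fields than this.

```python
import json

# Hand-written, trimmed sample for illustration only; not a real tweet.
# Field names are assumed from the v1.1 tweet object.
SAMPLE_TWEET = json.loads("""
{
  "created_at": "Sat Jan 09 10:15:00 +0000 2016",
  "id_str": "1234567890",
  "text": "Hello world",
  "source": "Twitter for iPhone",
  "lang": "en",
  "retweet_count": 3,
  "favorite_count": 7,
  "coordinates": null,
  "user": {
    "screen_name": "example_user",
    "location": "London",
    "followers_count": 120,
    "friends_count": 80
  }
}
""")


def flatten(obj, prefix=""):
    """Return a dict mapping dotted field names to leaf values."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Nested objects such as "user" become "user.screen_name", etc.
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


fields = flatten(SAMPLE_TWEET)
visible = fields["text"]  # what a reader sees in the timeline
hidden = {k: v for k, v in fields.items() if k != "text"}  # everything else

print(f"Visible text: {visible!r}")
print(f"Metadata fields travelling with it: {len(hidden)}")
for name in sorted(hidden):
    print(f"  {name} = {hidden[name]!r}")
```

Even in this cut-down sample, one short sentence of text comes with eleven other pieces of information.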
What’s the big deal?
With access to all this data for anyone on Twitter, you can:
- Analyse your tweets based on # of replies / # of retweets
- Analyse your followers / friends (and similarly identify your unfollowers)
- Analyse other people’s tweets based on keywords, location, # of replies / # of retweets
- Analyse other people’s friends / followers
- Follow / unfollow in bulk.
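Two of the analyses in the list above are simple once the data is in hand. The sketch below is only illustrative: it assumes you have already downloaded tweets as dictionaries with a `retweet_count` field and saved two snapshots of follower IDs. The function names `top_tweets` and `unfollowers` are my own and are not part of any Twitter library.

```python
def top_tweets(tweets, n=3):
    """Return the n tweets with the highest retweet counts, highest first."""
    return sorted(tweets, key=lambda t: t["retweet_count"], reverse=True)[:n]


def unfollowers(previous_ids, current_ids):
    """Return IDs present in an earlier follower snapshot but missing now."""
    return set(previous_ids) - set(current_ids)


# Made-up sample data for illustration.
sample_tweets = [
    {"text": "Monday blues", "retweet_count": 2},
    {"text": "New blog post!", "retweet_count": 15},
    {"text": "Coffee time", "retweet_count": 0},
]
ranked = top_tweets(sample_tweets, n=2)
lost = unfollowers(previous_ids=[1, 2, 3, 4], current_ids=[2, 4, 5])

print([t["text"] for t in ranked])
print(sorted(lost))
```

Identifying unfollowers is nothing more than comparing yesterday's follower list with today's, which is why so many small apps offer it.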
The large volume of high-velocity tweets, rich in meta data, has made Twitter the darling of hackademics and journalists, allowing them to study public sentiment and compare behaviours within a social network and across different social networks.
Apps like Sleeping Time built by Amit Agarwal can tell you what time a person sleeps based on their last 1,000 tweets.
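I don’t know how Sleeping Time actually works, but one plausible approach is easy to sketch: look at the hour of day on each tweet’s timestamp and find the longest stretch of hours in which the person never tweets. The Python below assumes timestamps in the `created_at` format used by the version 1.1 API and works in UTC. The function name `quiet_window` and the sample timestamps are invented for illustration.

```python
from datetime import datetime

# Assumed format of the v1.1 "created_at" field,
# e.g. "Mon Jan 04 08:05:00 +0000 2016".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def quiet_window(created_at_values):
    """Return (start_hour, end_hour) of the longest tweet-free stretch.

    Hours are in the timestamps' own time zone (UTC for the API).
    The stretch runs from start_hour up to, but not including, end_hour
    and may wrap past midnight. Returns None if there are no tweets or
    if every hour of the day has at least one tweet.
    """
    active = [False] * 24
    for value in created_at_values:
        active[datetime.strptime(value, CREATED_AT_FORMAT).hour] = True
    if not any(active) or all(active):
        return None

    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i in range(48):  # two passes so a run can wrap past midnight
        hour = i % 24
        if active[hour]:
            run_len = 0
        else:
            if run_len == 0:
                run_start = hour
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
    return best_start, (best_start + best_len) % 24


# Made-up timestamps for illustration.
sample = [
    "Mon Jan 04 08:05:00 +0000 2016",
    "Mon Jan 04 13:30:00 +0000 2016",
    "Tue Jan 05 18:45:00 +0000 2016",
    "Tue Jan 05 22:10:00 +0000 2016",
]
window = quiet_window(sample)
print(window)
```

With a thousand tweets instead of four, the quiet stretch becomes a fairly good guess at when someone is asleep, once the hours are shifted into their local time zone.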
Happygrumpy.com will assess your mood, tell you how you compare to other Twitter users (apparently I’m happier than Oprah Winfrey) and tell you who your celebrity twins are (mine are Emma Watson and Ryan Seacrest! Thank goodness – the last time I checked it was Donald Trump!)
Using Machine Learning and Natural Language Processing, you can conduct text and sentiment analysis with a wide range of providers such as Alchemy, Aylien, Indico.
I’ve been playing around with IBM Watson’s personality insights (which is used by employers when hiring) and IBM Watson’s trade-off analysis (which is used by financial advisory / insurance companies to determine one’s risk profile or risk preference). One hedge fund has even taken to analysing tweet data as the basis of its investment strategies.
If you go to IBM’s personality insights live demo site and click on “Analyse My Twitter Personality”, it will analyse a sample of your tweets. This is what it generated for me.
Critical and opinionated – check.
Energetic and fast-paced – check.
Unlikely to be influenced by social media during product purchases – check!
More detail can be found in the bar charts below. I particularly like the callout for Emotional Range – “This demo cannot diagnose a mental illness.” Apparently I have a high emotional range, but it does not mean that I am a positive or happy person. Go figure.
A Sunburst Visualisation can also be generated.
According to IBM’s Very Strong Analysis of my 15,484 words, I’m adventurous and achievement-striving but authority-challenging! Is it accurate? Perhaps, perhaps not, but these are apparently the personality traits that my tweets portray.
Understanding the data that is publicly available from Twitter’s API
I think it is important that everyone is made aware of the personal information that Twitter collects and makes freely available to so-called “developers”. As mentioned before, anyone can register as a developer and get their hands on this meta data.
If you have geo-location enabled for your tweets, Twitter knows exactly where you are. So does anyone else accessing the Twitter API data, especially if you use earlier versions of Twitter for iOS or Android. This is not the same as declaring a location on your profile: each tweet can have location data with GPS coordinates attached to it.
This is directly from Twitter’s FAQ section:
If you choose to toggle on the “Share precise location” button (available on Twitter for iOS version 6.26 or later, and on Twitter for Android version 5.55 or later), your precise location (latitude and longitude) will be associated with your Tweet and findable via API. If you Tweet using an earlier version of Twitter for iOS or Twitter for Android, every Tweet you geotag will include your device’s precise location (latitude and longitude) which can be found via API.
Third-party applications or websites may let you Tweet with location, including your precise location. We ask these developers to clearly explain what information is being shared when you use their products to Tweet with location.
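To show how little effort “findable via API” involves, here is a minimal Python sketch that pulls the precise location out of a tweet. It assumes the version 1.1 tweet object, where a geotagged tweet has a `coordinates` field holding a GeoJSON point in longitude-then-latitude order and a tweet without a location has `null` there. The helper name `precise_location` and the sample tweets are made up for illustration.

```python
def precise_location(tweet):
    """Return (latitude, longitude) for a geotagged tweet, else None."""
    point = tweet.get("coordinates")
    if not point or point.get("type") != "Point":
        return None
    # GeoJSON stores positions as [longitude, latitude].
    longitude, latitude = point["coordinates"]
    return latitude, longitude


# Made-up sample tweets for illustration.
geotagged = {
    "text": "Lunch by the river",
    "coordinates": {"type": "Point", "coordinates": [-0.1276, 51.5072]},
}
not_geotagged = {"text": "No location here", "coordinates": None}

print(precise_location(geotagged))
print(precise_location(not_geotagged))
```

Run something like this over a few weeks of someone’s geotagged tweets and you have a list of places with timestamps attached.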
I don’t like Twitter knowing that much about me, nor do I like the fact that my data is made available to anyone who wants it.
If you follow a structured and predictable schedule, it would not be difficult for anyone to work out what that schedule is and use that information for no good. You can remove location data from your tweets by following these instructions:
What about you? What do you think about the data you can get from Twitter’s API? If you want to play around with Twitter API data yourself, check out Part Two of this blog post, “How to Access Twitter’s API: A Guide for Non-Developers”.