Using transit to visualize income inequality and census data

“New York City has a problem with income inequality. And it’s getting worse—the top of the spectrum is gaining and the bottom is losing. Along individual subway lines, earnings range from poverty to considerable wealth. The interactive infographic here charts these shifts, using data on median household income, from the U.S. Census Bureau, for census tracts with subway stations.”

http://projects.newyorker.com/story/subway

 

This New Yorker interactive is one of my recent favorites. Its presentation of census income data goes beyond simply mapping income across the city. It presents income as a simple line chart, which gives a more direct visual comparison than a heatmap of the same information would. The structure of the subway orients viewers and points them to how close together the city’s inequalities sit. The differences between stops are dramatic, and their proximity is familiar to anyone who rides the subway.

This was a really effective presentation of data for me because of how prominently the subway and other transportation systems figure into our daily lives and orient the way we see a city. Using only the tracts that have subway stops in them also eliminates a large part of the city. While this may seem to be a limitation at first, it actually serves to highlight income inequality, leaving a stronger impression of the data and inspiring further investigation that could be applied to a larger area. It is also especially effective because it hints at the story of how infrastructure is tightly intertwined with income and how the building of public transportation drives changes in income.

The readers of the magazine are the intended audience for this graphic, but it also expands the audience of the paper magazine to those with a general interest in the city. Community leaders could well use this visualization to present issues of inequality, and those interested in the changing real-estate landscape may choose to use this format to map historical income data as well as the growth projections in the census estimates.

[Image: screenshot of the interactive’s income graph, taken 2015-02-05]

The grid in the background of this graph is divided into the three boroughs that the train travels through. I think the full name of each borough could have been spelled out instead of using three-letter abbreviations. Overall, the borough division is especially effective: it shows the drastic increase in income in Manhattan versus other parts of New York City. It would be harder to do this in cities where inequality is less clearly delineated by geographic region.

For the L train, which bisects Manhattan before going across the river to Brooklyn (map, lower right), the decline in income is especially clear (graph, lower left).

[Images: two screenshots of the interactive’s L train view, taken 2015-02-05]

 

 

a day of data: 2/5/15

-8:30: my “eat breakfast, you degenerate” alarm goes off. my phone is synced with the cloud and collects metadata about my usage.  I grudgingly wake up and check my email while lying in bed (server access data; gmail usage; reply/deletion actions are recorded, and whoever I reply to knows I wake up around 8:30).

-8:50: i use the bathroom; i assume East Campus’ water usage is monitored in aggregate, so mine becomes part of the total.  to avoid waking my roommate, i do my hair and makeup in the communal bathroom instead of our room. my aim is to avoid creating any stimulus that will wake her, whether that be light or sound.  does that count as data?

-9:00: breakfast. i recycle the empty soymilk container and the box of cereal i just finished; to an enterprising investigator, trash could be considered a form of aggregate data about my hall’s eating habits.

-9:20: i head to class with my friend.  we leave footprints in the snow; my shoe size, footprint, and gait pattern are probably individually identifiable.  my phone has GPS enabled because my friends and i installed an app that pings my location to them, but i have (as i often do) left my phone in my bed.

-10:00: i ask a question in class. the girl next to me writes down the professor’s answer in her color-coded latex-ed notes.  i am also knitting a hat in class; its length is a linear function of time spent not completely engaged in class. one of my friends sitting near me sends me an email about my hat.

-11:00: i arrive at 6.046 lecture, which is being taped.  i spend the lecture knitting.  i don’t think i’m in the camera’s field of vision, though.

-12:30: i go to the course 6 lounge, using my ID to access, and make coffee.

-1:00: Japanese class.  i take a quiz; my score will presumably live in a spreadsheet somewhere and be used to calculate my grade.

-2:00: finally, a break.  i use my MIT ID to access my dorm, check my email, reblog a few posts on tumblr, and access facebook.  i don’t like anything, but i do click several links.  i leave my dorm and use my debit card to buy food at the food truck by MIT Medical.

-3:00: i head to CMS.631.  I check out links on my computer (internet browsing data), contribute points and ideas to the posters on the walls, which are going to be used to shape the course of the class.

-4:30: i go home (ID for access again).  my roommate is still asleep.  our electrical usage, which i have heard is tracked by room, is negligible except for the heater and various chargers for the day.

-5:00: i browse the internet and eat random food that belongs to me and that i found in the freezer.  it has been a fixture in the freezer for a long time; the next person looking for food might be perplexed that a landmark they’ve come to rely on is gone.  there’s probably a ton of browsing data, tumblr reblogging, and email replies/deletions/reads.

-6:00: i decide it’s a great idea to work out instead of taking a nap. i used to use an app that tracked the miles i ran and the speed at which i ran; since it encouraged me to run as far as i could (and therefore get overuse injuries) i bring my phone with me only to listen to music. GPS is enabled, so my friends, if they wanted to know, are aware of my location.

-6:50: i have a 7 pm class.  i grab clothes from the shelves.  my roommate, if she were nosy (she’s super awesome and probably wouldn’t pry into my life that much), could deduce that i’ve worked out (gym clothes in my laundry bag, and the shelves where my clothes live are a mess because i couldn’t find pants).

-7:00: i attempt to get into the Media Lab. i don’t have card access.  someone somewhere knows i tapped my card unsuccessfully about 3 times.  someone with access lets me in.

-7:45: my hat is longer. i’ve clicked several links on the class subreddit.

-8:30: we attempt to eat dinner with the housemaster.  card access to the west parallel of east campus.  all the food is gone. he is perplexed to see us.

-8:50: my friend invites me to dinner at maseeh.  i tap my ID again.

-10:00: i begin to attempt homework. many, many wikipedia pageviews, mostly linear classification and the perceptron algorithm. somehow i also end up reading up on the use of singular “they,” gender in news reporting, and ethics….meanwhile i’m listening to music on either pandora or youtube, who are definitely collecting data about my listening patterns and preferences, which are way more mainstream than i’ll ever admit to.  my youtube homepage is deeply embarrassing.

-12:00: i write out Japanese vocabulary on the chalkboard in the hallway.  people walking by after i go to sleep will know i was studying the meaning of “to see, honorific” and “space alien,” probably for a quiz tomorrow, because i talk a lot about how terrible i am at studying vocabulary.

-1:30: more tumblr; likes and reblogs. the books scattered on my desk explain, roughly, what i’ve been working on tonight. i write myself notes on my hand about the things i need to get done tomorrow. i set alarms to wake tomorrow, reply to some last emails, and fall asleep. the fact that the light is off in the room and the person-shaped lump in the corner inform my roommate that i’m trying to sleep, so she’s super stealthy when she comes in. those are the best kinds of data-driven decisions.

 

data log: Laura

My data log for Sunday Feb 8th.  The list includes only “collected” data–does data exist before it is recorded? (if a tree falls in the forest, does it make a sound?).

Some examples of non-recorded/non-observed data that I created include: vital signs, sleep habits, eating habits, actions, trash generation, movement patterns, items in my environment (e.g. furniture), time use, products/consumables used, sewing machine use, radio use, newspaper/book reading speed, typing speed.

I also noticed that the amount of data I created on a very quiet weekend day at home was significantly less than the amount of data I created on a weekday working at school. Interacting with society creates more information!

Collected data includes:

–electricity use, gas use in the apartment
–gmail use throughout the day: receivers of messages, contents of messages, timing of receiving/sending/reading messages, amount of time spent per message
–online activities (chrome): sites visited, length of visit, times of visit, content, followed links, recirculated links…analytics blocker stops some of this data from collection?
–gchat conversations: person spoken to, content, timing
–texting & calling friends & family: time of exchange, length of exchange, contents, rough location? (tracking is off, but cell towers or other methods?)
–google calendar: date of event, reshuffling of events, location of events

Couples Text Messages are Decoded

A recent newsletter article from the Parisian website Merci Alfred, Les SMS des couples déchiffrés (which can be translated as “Couples Text Messages are Decoded”), uses a few infographics to show how texting has become part of couples’ new language. It gives stats and possible trends on couples’ texting behavior in a humorous way. Over 100 million text messages between couples were analyzed with the help of Tx.to, a website that allows you to print your SMS conversations.

The figures are split by gender, and the questions covered include: share of texts sent according to the status of the relationship and the day of the week, length of sent texts, most frequently used words, most used emojis, when the first “I love you” and “make love” are said, response time between texts, etc.

The goal of the data presentation seems to be to show that your behavior in a relationship differs according to your gender. Even though they claim a study of over 100 million texts, the audience understands that the point is not to run a scientific study but rather to show stats in a funny way, as almost every graph is annotated to highlight that difference. For instance, where the response time between texts is shown as 2:30 min for women vs. 4:30 min for men, the annotation says: c’est parce qu’on s’applique (it is because we try harder).

The data presentation is effective because it distinguishes the genders with different colors and shows simple binary comparisons with only a few figures per graph. Males and females behave differently. We all know that.

The article went out to the recipients of a newsletter that targets urban men living in Paris, but it is also shown on the website. So it really aims less at a specifically male audience than at a fairly young urban one, roughly 18 to 35 years old.

Val’s Data Log (2/8/15)

  • woke up, notified sleep tracking app on my Android phone device
  • checked email, social media (replied to several emails and one Facebook message)
  • checked my living group’s meal plan signup Google spreadsheet
  • looked at Google Calendar for the upcoming week
  • checked CMS.619 syllabus (Google Doc) for clarification on assignment
  • clicked hyperlink to class blog (to read others’ blogs)
  • checked wunderground.com in anticipation of the snowstorm (data on website sourced from many measurement sites, my visit logged)
  • read several articles posted on Facebook.com by friends (clicks and likes logged by Facebook, views of articles logged by their respective websites)
  • logged 2 books in my goodreads.com “to-read” queue
  • showered (total water use measured by City of Cambridge)
  • brushed teeth several times during the day (water use)
  • used the bathroom several times throughout day (water use)
  • ate communal food at living group (all communal food purchased with house card, logged by financial services corporation (e.g. Visa or AmEx), total food use measured weekly by “stewards” who purchase food)
  • heated food on gas stove (amount used measured by gas supplier)
  • used various electronic devices (electricity used is measured by electricity supplier)
  • logged TA work hours for this week on MIT’s Atlas website
  • listened to music on Bandcamp.com (plays logged by site)
  • listened to music on Grooveshark (data both logged by site and sent to my Last.fm account)
  • worked on several assignments on LibreOffice (data stored on my computer)
  • reblogged 2 Tumblr posts, added 14 to queue, liked 5 (data logged by Tumblr)
  • printed assignments and readings for lab class (documents downloaded to computer from course site and data sent to printer)
  • spent most of the day with my cell phone in my pocket (data usage and location tracked)
  • spent time with housemates, most of whom also have data and location-tracking phones
  • notified sleep tracking phone app that I was going to bed

Data Log (ceriley)

Sunday, February 8th 12:00am to Monday, February 9th 12:00am

Italicized are the activities that didn’t produce data that were recorded digitally, but they’re activities that I would report if someone asked me to fill a survey about my day. Everything is chronological but times are approximate.

  • 12:00am — used computer, connected to MIT wifi (Tumblr, Gmail)
  • 12:20am — read Yes, Please 
  • 1:20am — used water to brush teeth, etc.
  • 1:30am — went to bed
  • 10:15am — woke up without an alarm
  • 10:30am — used water to brush teeth, etc.
  • 11:20am — made a 33 minute phone call home
  • 12:00pm — used water and electric stove to cook lunch
  • 12:30pm — mobile google search: amazon hamburger earmuffs
  • 1:00pm — used computer, connected to MIT wifi (Stellar, MIT Admissions Blogs, Gmail)
  • 1:00pm — sent an email
  • 1:10pm — new Adobe InDesign document
  • 1:30pm — google search: worldview
  • 1:30pm — replied to an email
  • 2:00pm — used water to wash dishes
  • 3:00pm — used computer, connected to MIT wifi (Tumblr, Twitter)
  • 3:00pm — sent an email
  • 3:05pm — skimmed Wikipedia article (Japanese mobile phone culture)
  • 3:10pm — google search: prerecorded
  • 3:10pm — google search: ENT
  • 4:00pm — mobile google search: how much milk in a milk shake
  • 4:05pm — called dorm elevator and rode it down 11 floors
  • 4:10pm — paid for ice cream and milk with TechCash
  • 4:15pm — scanned my MIT ID at 2 card readers to gain access to dorm, called dorm elevator and rode it up
  • 4:20pm — used electric blender to make milkshakes
  • 4:30pm — sent 4 text messages
  • 4:30pm — sent a snapchat
  • 5:00pm — used computer, connected to MIT wifi (Stellar, Meyer Lab, MIT+K12 Videos)
  • 5:30pm — updated InDesign document and saved
  • 6:00pm — exported a video from Adobe Premiere and uploaded to shared Dropbox folder
  • 6:15pm — replied to an email
  • 6:20pm — read printed book chapters and took notes
  • 8:45pm — used computer, connected to MIT wifi (Tumblr, YouTube, Facebook, Twitter, MIT Emergency)
  • 9:00pm — watched 4 new YouTube videos in my subscriptions
  • 9:20pm — watched and liked a video on Facebook
  • 9:30pm — read printed articles and took notes
  • 10:00pm — used computer, connected to MIT wifi (Gmail)
  • 10:05pm — replied to 2 emails
  • 10:10pm — google search: academic calendar 2015
  • 10:15pm — wrote in 2 events in my planner, checked off to-do list items
  • 10:20pm — wrote in notebook
  • 10:30pm — checked mailbox, read letter
  • 10:35pm — mobile: sent an email
  • 10:40pm — baked cupcakes using water, oven, and electric mixer
  • 11:20pm — played Bananagrams (in Spanish and English) and Blockus

In total: received 36 emails and deleted 20; reblogged 2 Tumblr posts (7 more were posted from queue), liked 9, queued 16

Desi’s Daily Data

Sunday, February 8, 2015

I wake up to my alarm clock at 8:45am.

I receive text messages from a friend in the Netherlands and from my family. These texts include words and images; the phone also tracks the date and time that these messages were sent and read.

I take my daily birth control pill; the empty slots in the case demarcate which pills have already been taken.

I jot down my to-do list for the next few days, and mark items as I complete them.

I read and write emails. Gmail archives these sent messages and tracks the date and time sent and who received the emails.

I write a cover letter in Microsoft Word. My computer tracks the last time this document was saved.

I read many articles on the internet. Various websites maintain analytics about my visit, including information such as timestamp, my location, length of visit, and IP address.

Throughout the day, I blow-dry my hair, turn light switches on and off, and power my laptop: just some of the many ways I use electricity. My apartment’s electric company keeps track of electrical expenditure in my apartment.

I go to a coffee shop and pay for tea. The cashier hands me a receipt, which is a record of the amount I spend (and pay in cash) and the time and date of my transaction.

Back at home, I check my email and see an alert that my bank has charged me for an ATM fee from two days ago.

I transcribe interviews for my thesis, making sure to include periodic timestamps in the transcript.

I browse on Twitter, marking certain tweets as “favorites.”

I call a good friend who lives several states away; my phone keeps a record of the date, time, and length of our conversation (57:26).

I input recent shared apartment expenses into Splitwise, a website that allows for easy tracking of bills amongst roommates and friends.

I update a few upcoming meetings and events in my calendar, noting time, date, location, and brief description of each engagement.

I write up this blog post, recording the happenings of my day. As soon as I click “publish,” WordPress will generate data about visits to this post.

Data Log (hsubrama)

This log runs from 7AM Saturday, 02/07/2015 to 7AM Sunday, 02/08/2015.

  • Snoozed alarm once and then woke up
    • My alarm clock app keeps track of my sleep schedule
  • Responded to a text message
    • My text conversations are stored in the iOS Messages app
  • Exercised
    • Treadmill recorded calories burned, speed, distance, etc.
    • Recorded exercise in fitness app, which collects data about my exercising and my meals
  • Showered and got ready
    • Water usage is recorded by the Cambridge Water Department
  • Did laundry and paid with MIT TechCash
    • Electric company records power consumption (of various electronic devices)
    • MIT records TechCash balance and transactions
  • Went to brunch and ate
    • Bon Appétit (provides MIT dining) records consumed food so they know what to cook and what to charge
    • MIT records that my ID card was used for food
    • I wrote a comment card because they served a brunch meal that I liked, Bon Appétit collects this information
  • Filled up water at filtered water fountain on campus
    • Fountain has LED display indicating total gallons of water poured
  • Worked on homework
    • MIT records connection to wireless networks (ex. MIT SECURE, MIT GUEST)
    • Google records some data from searches, calendar events, documents, email, etc.
    • Amazon Web Services monitors my running servers and collects various statistics on them
    • GitHub records my activity history
    • and all the other websites record information about my visit
  • Downloaded new podcast episodes
    • iTunes records my podcast downloads
  • Sent a few emails
    • Emails were archived
  • Left to go pick up gift for loved one, called a car to take me there
    • App records each time I call a car, and my location
    • Credit card company records charge when I pay
  • Picked up gift
    • Store security cameras record activity
    • Store records sales
  • Returned to dorm and called brother
    • Phone company records call history
    • Swiped ID to enter dorm and use elevator, this is recorded by MIT
  • Worked on homework
    • Websites track my activity
    • Answered Piazza (MIT classes use this Q&A service) question and asked Piazza question – these were archived on Piazza
  • Factory reset corrupted tablet and reconfigured
    • Configuration info was sent to both HP and Microsoft
    • Downloaded apps are tracked by Microsoft
  • Listened to music
    • Spotify records my songs and various other data about my visit
  • Watched movie with friends
    • Netflix recorded our place in the video, when we started watching, and what we watched
  • Visited another MIT dorm
    • Front desk required ID, MIT recorded the access
  • Worked on homework
    • By downloading new C++ library, the hosting website recorded the download
  • Got ready for bed and made calendar events for next day
    • Google calendar recorded my calendar event and other data from my visit

Daily Data Log

Data log, starting at 12:20 am on February 6, 2015 and ending at 2:13 pm on February 6, 2015.

I am currently generating this document.
You are currently reading this document, probably in a browser which is keeping track of all of the sites you have visited in a while, on a computer that is continuing to send packets to a router to stay connected to the internet, and even more packets to a server when you refresh/load/interact with a page. This page is probably keeping track of how many visitors it has seen. Through you, I am generating data.

Before going to sleep, I wrote the following in a note (another generated document) on my phone’s S Memo app:
I charged my phone, consuming a fairly small amount of power that was recorded by Nstar, which provides my apartment with electricity.
I turned up the heat before going to bed; the gas used is also being recorded by Nstar.
Sleep time and duration could have been observed (by myself or an outsider) and recorded; before sleep, I set an alarm. Upon wake, I turned off an alarm after it rang twice. The music it emitted was also data that could have been collected.

I used an amount of water to brush my teeth.
I followed a schedule that exists online, using Google Calendar.
It took me a certain amount of time to walk to campus; my cell phone listened for signals from GPS satellites to track its position as I walked.
There may have been a cell tower handoff if I switched coverage zones; it is always possible to track me to within some radius if my cell phone is on.

I used a credit card to purchase a quesadilla for lunch at Anna’s.
My bank statement records most meals I have, and also most meals I miss.
My Firefox browsing history keeps track of all of the sites I have visited in the past year.
My current tabs (which are also available on my phone) keep track of what I’ve been interested in over the past week.
I googled some queries, which was recorded by Google.
I scribbled my homework solutions in a green notebook.

The Science Checks Out

A recent article from FiveThirtyEight, “Americans And Scientists Agree More On Vaccines Than On Other Hot Button Issues,” highlighted data from a 2014 Pew Research Center study on public attitudes towards science-related issues. In the graphic below, we can see how Democrats, Independents, and Republicans’ views compare with one another — and with those of scientists from the American Association for the Advancement of Science. This story was released in the wake of controversial statements made by Republican presidential hopefuls Chris Christie and Rand Paul that reignited the debate over whether or not children should be vaccinated. The public has much more of a (positive) consensus — both across the political aisle and with the scientific community — on the topic of vaccination compared to global warming, evolution, and GMOs.

Scientist Public-Split On Science-Related Issues

In the original opinion polls, approximately 65% of Independent and Republican respondents and 75% of Democratic respondents believed that all children should be required to be vaccinated, compared to about 85% of AAAS members. Given FiveThirtyEight’s brand of reporting, I would expect the intended audience of this graphic to be highly data-literate, and most likely closer to the scientific side of the spectrum. While the graphic isn’t explicitly partisan, it does highlight data suggesting that Republicans are less in agreement with the scientific establishment (though the responses to the GMO question invert this). As such, the graphic alone might play into a narrative about how Republican politicians like Christie and Paul are trying to pander to extremist, science-denying voters.

However, the article itself points out that plenty of the other potential Republican presidential candidates are pro-vaccination. Most voters, regardless of political affiliation, agree with science. The goal of this data presentation, then, might be to show how we’re not so different after all across the aisle; with the exception of global warming, members of the public are more often in agreement with each other than with scientists. The graphics shown on the Pew summary, and even Pew’s interactive tool, don’t mention politics at all, combining all respondents into a single group. If the goal was to show how far off Christie and Paul were in relation to the broader public (and science!), then this data presentation is effective. It demonstrates that their comments were anomalies and not representative of Republican voters.

One criticism I do have is that the line used to denote the scientists’ views is too bold, overpowering the actual tick lines. It might be misinterpreted as the 100% mark, making all the numbers seem higher than they really are. Even though there is a relative public consensus around vaccination, there is still a large number of people (about a third of Republicans and Independents, a quarter of Democrats, and even a good number of scientists) who don’t believe that vaccinations should be mandatory. Another point is that there is a difference between believing that vaccinations are beneficial and believing that they should be mandatory; other factors, such as one’s philosophy about the role of government, are certainly also operating in this data set.