That’s a word cloud. It took me about 30 seconds to build.
It shows the relative word frequencies in the titles of posts to r/dataisbeautiful over the last 7 days.
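For anyone curious about what sits underneath a word cloud, the relative frequencies can be sketched in a few lines of Python. This is only an illustration: the titles and the stop-word list below are made up, and it is not how the cloud above was actually built.

```python
import re
from collections import Counter

# Hypothetical post titles standing in for the real r/dataisbeautiful feed.
TITLES = [
    "Map of UK rainfall by county",
    "UK house prices since 1990",
    "A map of every pub in the UK",
]

# A tiny, illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "of", "the", "by", "in", "since", "every"}


def relative_frequencies(titles, stop_words=STOP_WORDS):
    """Return each word's share of all counted words, most common first."""
    words = [
        word
        for title in titles
        for word in re.findall(r"[a-z']+", title.lower())
        if word not in stop_words
    ]
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.most_common()}


freqs = relative_frequencies(TITLES)
for word, share in freqs.items():
    print(f"{word:10s} {share:.0%}")
```

A word cloud then simply draws each word at a size proportional to its share.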
Two weeks ago the UK’s Office for National Statistics (ONS) announced that the UK economy had grown by 0.9% in the second quarter of 2014. This growth was partly due to the decision to include prostitution in gross domestic product (GDP) for the first time ever.
The ONS estimate that Britain’s 60,879 prostitutes contribute £5.314bn (0.4% of GDP) to the UK economy. But this figure completely ignores male prostitution. In a previous post we used data from AdultWork, a popular sex worker website, to determine that 42% of UK prostitutes are male, which (all other factors being equal) represents an additional £3.542bn of UK GDP!
Our initial analysis was very simple, but it nonetheless raised a lot of interest from you all, along with some great questions about the differences between male and female sex workers. We decided to look a little deeper into the gender differences amongst sex workers online and this is what we found.
How many prostitutes are there in the UK? According to the Office for National Statistics the answer is 60,879. In figures due to be released next week this number is being used to add £5.314bn to the official size of the UK economy. But the ONS only attempted to measure the number of female prostitutes. If male prostitutes are included in the count then the contribution that prostitution makes to the UK economy rises to £8.856bn.
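The arithmetic behind these figures is easy to check. The sketch below uses only the numbers quoted above and assumes that every sex worker, male or female, contributes the same amount to GDP. That equal-contribution assumption is mine, not an ONS method, and the function and variable names are just for illustration.

```python
# Figures quoted in the text (GBP billions and head count).
FEMALE_CONTRIBUTION_BN = 5.314
TOTAL_CONTRIBUTION_BN = 8.856
FEMALE_WORKERS = 60_879


def implied_male_figures(female_bn, total_bn, female_workers):
    """Back out the male contribution, male share and male head count.

    Assumes an equal contribution per worker regardless of gender
    (an illustrative assumption, not an official methodology).
    """
    male_bn = total_bn - female_bn
    male_share = male_bn / total_bn
    male_workers = female_workers * male_bn / female_bn
    return male_bn, male_share, male_workers


male_bn, male_share, male_workers = implied_male_figures(
    FEMALE_CONTRIBUTION_BN, TOTAL_CONTRIBUTION_BN, FEMALE_WORKERS
)
print(f"Male contribution: £{male_bn:.3f}bn")
print(f"Implied male share of all sex workers: {male_share:.0%}")
print(f"Implied number of male sex workers: {male_workers:,.0f}")
```

With these inputs the male contribution is the difference between the two headline figures, £3.542bn, and the implied male share of all sex workers comes out at roughly 40%.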
From the 30th of September the UK’s national accounts will attempt to measure illegal transactions to which all parties consent, including the sale of illegal drugs and prostitution. Illegal transactions are difficult to measure because the participants, while willing, are anxious that their business goes unnoticed. As a result there are very few obvious ways to measure illegal transactions directly, and the ONS have been forced to rely on 10-year-old survey data to estimate the level of prostitution activity in the UK.
But there is a better way of measuring the number of prostitutes than using survey data. While many of the activities associated with prostitution are illegal in the UK, paying for sex is actually legal and as a result, prostitution services are widely marketed on the web. We can use Import.io to directly count the number of prostitutes who are marketing on the web and attempt a better estimate of the number of prostitutes in the UK.
Every iPhone release takes its own special path through the hype cycle.
Before a release the rumours start about what features the phone will have. The hype builds to a frenzy that peaks with a glittering presentation of the new device and pictures of queues, as fanboys line up to be the first to try the new product.
We will get over it though and before you know it we will be asking ourselves “when is the next release due?”
The Scottish referendum is happening today.
The political debate has all of a sudden erupted into noise and confusion and it made me wonder how people should vote.
I can only do this once: I am going to try and write down my first impressions of San Francisco and Silicon Valley.
I must admit that this is not my first time in the Bay Area on business, but it is the first time that I am here with a view to establishing a permanent home for our company, and that brings with it a different mindset that I hope to be able to share with you.
A little bit of background first. My name is Andrew Fogg. I am a co-founder of two technology businesses in London: Kusiri and import•io. I am spending the summer in San Francisco, leading import•io into Silicon Valley. My objective is to grow the business here, to help secure further financing for the company and to learn from those that have gone before about how to best build a successful technology company.
Over the coming weeks I will give you an insight into how things are done out here, how it is different to London and what can be learnt from the Valley. So without further ado, this is my view of Silicon Valley.
HTML and other related Web technologies were initially developed to allow for the publication of natural language documents on the Internet, for reading by humans.
The key phrase in the previous sentence is “for reading by humans”.
Humans are very good at reading natural language documents and interpreting their meaning. Computers, on the other hand, while good at serving these documents, are very bad at reading and interpreting them. Computers require precise instructions: if you want a computer to do something for you, it is better to have data than documents.
The standards for serving Web documents have been widely adopted and they power the Web that we know and love today. By contrast, while there are proposed standards for serving Web data, these standards have not been widely adopted: fewer than 1% of websites use RDF.
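To make the difference between a document and data concrete, here is a minimal sketch using only the Python standard library. The HTML fragment, the JSON payload and the product in them are all hypothetical; the point is only to compare how much work a computer has to do in each case.

```python
import json
from html.parser import HTMLParser

# Hypothetical page fragment: a fact written for a human reader.
HTML_DOCUMENT = (
    '<div class="product"><h1>Kettle</h1>'
    '<span class="price">£24.99</span></div>'
)

# The same fact served as data.
JSON_DATA = '{"name": "Kettle", "price": 24.99, "currency": "GBP"}'


class PriceParser(HTMLParser):
    """Collects the text of the first <span class="price"> element."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.price_text = None

    def handle_starttag(self, tag, attrs):
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_data(self, data):
        if self._in_price and self.price_text is None:
            self.price_text = data

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False


def price_from_html(document):
    """Reading the document: rules tied to this particular page layout."""
    parser = PriceParser()
    parser.feed(document)
    parser.close()
    # Stripping the currency symbol is another layout-specific rule.
    return float(parser.price_text.lstrip("£"))


def price_from_json(payload):
    """Reading the data: a single key lookup."""
    return json.loads(payload)["price"]


print(price_from_html(HTML_DOCUMENT))
print(price_from_json(JSON_DATA))
```

Both functions return the same number, but the HTML version breaks as soon as the page layout changes, while the JSON version depends only on the name of a field.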
The web is a wonderful place for information. I can open a browser and have the answer to any question within minutes. But the web is not so great when it comes to data. Getting data from the web is difficult. And the only solution is a bit of a dirty secret for our industry, a dirty secret that we don’t like to talk about… “web scraping”.
The reality is that if you are a data owner with a data source on a website, then that data source is almost certainly being scraped, today. You have no insight into this. You have no control over it. It is just a cost for you. This is not good.
Web scraping is also not good for data users. It is costly because it requires expensive developer time. The rights that you have to use the data are uncertain: do you need to hide the fact that you are scraping? Do you need to combine the data with other data before you can use it? If you need multiple data sources then you create a data integration problem for yourself: you have to normalise and integrate the results of multiple web scrapes. Even if you tried to pay the data owner for access to their data source, they probably wouldn’t be able to take the money off you as they are not in the business of selling data.
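The integration problem is easy to underestimate, so here is a small sketch of what normalising two scrapes might look like. Everything in it is hypothetical: the two sources, their field names and the common `Listing` shape are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """The common shape that every source is normalised into."""
    title: str
    price_gbp: float


# Hypothetical rows as two different scrapers might return them.
SOURCE_A = [{"name": "Kettle", "price": "£24.99"}]
SOURCE_B = [{"product_title": "  Toaster ", "cost_pence": 1999}]


def from_source_a(row):
    # Source A gives the price as a string with a currency symbol.
    return Listing(
        title=row["name"].strip(),
        price_gbp=float(row["price"].lstrip("£")),
    )


def from_source_b(row):
    # Source B uses different field names and prices in pence.
    return Listing(
        title=row["product_title"].strip(),
        price_gbp=row["cost_pence"] / 100,
    )


def integrate(source_a, source_b):
    """Normalise both sources and merge them, cheapest first."""
    listings = [from_source_a(row) for row in source_a]
    listings += [from_source_b(row) for row in source_b]
    return sorted(listings, key=lambda listing: listing.price_gbp)


combined = integrate(SOURCE_A, SOURCE_B)
for listing in combined:
    print(f"{listing.title}: £{listing.price_gbp:.2f}")
```

Every new source needs its own conversion function, and each one has to be maintained whenever the site behind it changes.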
In summary, getting data is a problem and web scraping is neither a good technical solution nor a good economic solution.
Import•io is a place where data users (people who want data) and data owners (people who have data) can better interact. It is a platform upon which connectors to data sources can be built, along with a suite of tools that make it easy to build connectors to either APIs or web data sources.