16 works, 1,079 members, 32 reviews

Reviews

Succinct introduction to big data, its trends, applications, implications, shortcomings, and future. Enlightening anecdotes. No brain-numbing gobbledegook!
 
harishwriter | 29 more reviews | Oct 12, 2023 |
finally got back to this; might be a little dated, but still worth the read.
 
pollycallahan | 29 more reviews | Jul 1, 2023 |
Great overview of the "big data" heuristic approach, with many case studies and a clear analysis of all its practical, philosophical, and ethical implications.
 
d.v. | 29 more reviews | May 16, 2023 |
This book has been sitting in my Kindle queue since it was published in 2013. It still holds up, but there's not a lot here beyond high-level talk about what can be and has been done with big data. I didn't learn anything new.
 
auldhouse | 29 more reviews | Sep 30, 2021 |
The book gave me a new appreciation as to how we can go about making decisions in the future. It also helped me understand a bit more about the debate whirling around the NSA and its desire to get all the information that it can. On the other hand, it reminded me that making decisions based solely on data can be dangerous, as in the case of Robert McNamara and the Vietnam War.
 
larrybenfield | 29 more reviews | Jul 14, 2021 |
Although some time has passed since its first publication, it is a book that helps you see and understand big data.

The authors should update the book; since its first publication there have been many more examples of the uses to which big data has been put.
 
meleont | 29 more reviews | Jun 15, 2021 |
So much hype it's obscene. Heard this one before, except with different tech. Those who fail to learn history etc. or just show some restraint are doomed to overpromise and disappoint.
 
Paul_S | 29 more reviews | Dec 23, 2020 |
Cukier, Kenneth & Viktor Mayer-Schönberger (2013). Big Data: A Revolution That Will Transform How We Live, Work and Think. London: John Murray. ISBN 9781848547933. 257 pages. €12.99

Everyone talks of nothing but big data. And, as often happens with fads, they mostly talk nonsense. This is a fairly serious attempt to examine various aspects of the big data phenomenon without excess in the direction of either enthusiasm or alarmism.

It is a very important book, especially for those who work professionally in statistics, economics, sociology, and quantitative analysis: because, hype aside, big data radically changes the world of data analysis. Either statisticians (and official statistical offices) radically change their way of thinking and working, or they will be condemned to irrelevance. Understanding how big data works is the key to understanding the world we live in today. Nothing more, nothing less.

The book is clear and well structured in 10 chapters. In the end, I appreciated the early chapters, which cover several relevant aspects of big data, more than the later ones, in which a somewhat self-serving resistance to novelty emerges. But let's take it in order, chapter by chapter (references are to Kindle locations).

1. Now. Where the authors take stock of what big data is and why it matters.
One way to think about the issue today—and the way we do in the book—is this: big data refers to things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value, in ways that change markets, organizations, the relationship between citizens and governments, and more. [114]

There was a shift in mindset about how data could be used.
Data was no longer regarded as static or stale, whose usefulness was finished once the purpose for which it was collected was achieved, such as after the plane landed (or in Google’s case, once a search query had been processed). Rather, data became a raw material of business, a vital economic input, used to create a new form of economic value. [96]

The sciences like astronomy and genomics, which first experienced the explosion in the 2000s, coined the term “big data.” The concept is now migrating to all areas of human endeavor. [105]

Sometimes the constraints that we live with, and presume are the same for everything, are really only functions of the scale in which we operate. [181]

Yet the need for sampling is an artifact of a period of information scarcity, a product of the natural constraints on interacting with information in an analog era.
[...]
Big data gives us an especially clear view of the granular: subcategories and submarkets that samples can’t assess.
[...]
It’s a tradeoff: with less error from sampling we can accept more measurement error. [214-219]
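The tradeoff described here is easy to see in a quick simulation. Below is a minimal sketch (not from the book), with invented population sizes and noise levels: a small but exact sample barely contains a rare subgroup at all, while the complete, noisier dataset pins the subgroup's mean down well.

```python
# Sampling error vs. measurement error: a toy comparison.
import numpy as np

rng = np.random.default_rng(0)

N = 1_000_000                      # "n = all"
subgroup = rng.random(N) < 0.001   # a rare submarket: roughly 1,000 people
true_value = np.where(subgroup, 50.0, 10.0)

# Clean survey: exact measurements, but only a 1,000-person sample.
sample_idx = rng.choice(N, size=1_000, replace=False)
print("subgroup members in the sample:", subgroup[sample_idx].sum())  # often 0-3

# Big data: every record, but each measured with sizable error.
noisy = true_value + rng.normal(0, 20.0, size=N)
print("subgroup mean from all noisy data:", round(noisy[subgroup].mean(), 2))
# close to the true 50.0: with ~1,000 noisy readings the measurement
# error averages out, while the clean sample cannot see the subgroup
```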

Society has millennia of experience in understanding and overseeing human behavior. But how do you regulate an algorithm? Early on in computing, policymakers recognized how the technology could be used to undermine privacy. Since then society has built up a body of rules to protect personal information. But in an age of big data, those laws constitute a largely useless Maginot Line. People willingly share information online—a central feature of the services, not a vulnerability to prevent. [303]

2. More. Using all the data, not just a sample.
As noted in Chapter One, big data is about three major shifts of mindset that are interlinked and hence reinforce one another. The first is the ability to analyze vast amounts of data about a topic rather than be forced to settle for smaller sets. The second is a willingness to embrace data’s real-world messiness rather than privilege exactitude. The third is a growing respect for correlations rather than a continuing quest for elusive causality. This chapter looks at the first of these shifts: using all the data at hand instead of just a small portion of it. [309]

The very word “census” comes from the Latin term “censere,” which means “to estimate.” [337]

Sampling was a solution to the problem of information overload in an earlier age, when the collection and analysis of data was very hard to do. [372]

Using all the data makes it possible to spot connections and details that are otherwise cloaked in the vastness of the information. For instance, the detection of credit card fraud works by looking for anomalies, and the best way to find them is to crunch all the data rather than a sample. The outliers are the most interesting information, and you can only identify them in comparison to the mass of normal transactions. It is a big-data problem. [439]
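The fraud example can be made concrete with a toy experiment (invented transaction amounts, not the book's data): the outliers are defined against the mass of normal transactions, and a small sample will usually contain none of them.

```python
# Outlier detection wants the full dataset: a toy z-score experiment.
import numpy as np

rng = np.random.default_rng(1)
normal = rng.normal(40.0, 15.0, size=999_990)   # everyday card charges
fraud = rng.normal(900.0, 50.0, size=10)        # a handful of anomalies
amounts = rng.permutation(np.concatenate([normal, fraud]))

z = (amounts - amounts.mean()) / amounts.std()
print("flagged in full data:", int((np.abs(z) > 6).sum()))       # ~10

sample = rng.choice(amounts, size=1_000, replace=False)
zs = (sample - sample.mean()) / sample.std()
print("flagged in a 0.1% sample:", int((np.abs(zs) > 6).sum()))  # usually 0
```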

Reaching for a random sample in the age of big data is like clutching at a horse whip in the era of the motor car. [505]

3. Messy. Coming to terms with imprecision.
Big data transforms figures into something more probabilistic than precise. [557]

“Simple models and a lot of data trump more elaborate models based on less data,” wrote Google’s artificial-intelligence guru Peter Norvig and colleagues in a paper entitled “The Unreasonable Effectiveness of Data.” [623]

It bears noting that messiness is not inherent to big data. [623]

The Billion Prices Project
[...] two economists at the Massachusetts Institute of Technology, Alberto Cavallo and Roberto Rigobon [...]
Price-Stats [655-668: web scraping to produce the U.S. CPI]

4. Correlation. Correlations, predictions, and predilections.
Knowing what, not why, is good enough.
[...]
Correlations are useful in a small-data world, but in the context of big data they really shine.

In 1998 Linden and his colleagues applied for a patent on “item-to-item” collaborative filtering, as the technique is known. The shift in approach made a big difference. [800]
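For readers curious what item-to-item collaborative filtering looks like, here is a deliberately simplified sketch; it is not Amazon's patented implementation, and the ratings matrix is invented. Items are compared by the overlap in who rated them, and a user is recommended the unseen item most similar to what they already rated.

```python
# Item-to-item collaborative filtering via cosine similarity (toy version).
import numpy as np

# rows = users, columns = items; 0 means no interaction
R = np.array([[5, 4, 0, 1],
              [4, 5, 0, 0],
              [0, 0, 5, 4],
              [1, 0, 4, 5]], dtype=float)

def item_similarity(R):
    norms = np.linalg.norm(R, axis=0)
    sim = (R.T @ R) / np.outer(norms, norms)   # cosine between item columns
    np.fill_diagonal(sim, 0.0)                 # an item should not recommend itself
    return sim

sim = item_similarity(R)
user = R[1]                                    # user 1 rated items 0 and 1
scores = sim @ user                            # similarity-weighted scores
scores[user > 0] = -np.inf                     # hide items already rated
print("recommend item", int(scores.argmax()))  # item 3 for this toy matrix
```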

Because data was scarce and collecting it expensive, statisticians often chose a proxy, then collected the relevant data and ran the correlation analysis to find out how good that proxy was. But how to select the right proxy?
To guide them, experts used hypotheses driven by theories—abstract ideas about how something works. Based on such hypotheses, they collected data and used correlation analysis to verify whether the proxies were suitable. If they weren’t, then the researchers often tried again, stubbornly, in case the data had been collected wrongly, before finally conceding that the hypothesis they had started with, or even the theory it was based on, was flawed and required amendment. Knowledge progressed through this hypothesis-driven trial and error. And it did so slowly, as our individual and collective biases clouded what hypotheses we developed, how we applied them, and thus what proxies we picked. It was a cumbersome process, but workable in a small-data world. [857]

Our “fast thinking” mode is in for an extensive and lasting reality check. [1024]

In 2008 Wired magazine’s editor-in-chief Chris Anderson trumpeted that “the data deluge makes the scientific method obsolete.” In a cover story called “The Petabyte Age,” he proclaimed that it amounted to nothing short of “the end of theory.” The traditional process of scientific discovery—of a hypothesis that is tested against reality using a model of underlying causalities—is on its way out, Anderson argued, replaced by statistical analysis of pure correlations that is devoid of theory. [1147]

5. Datafication. Turning phenomena into data.
The word “data” means “given” in Latin, in the sense of a “fact.” It became the title of a classic work by Euclid, in which he explains geometry from what is known or can be shown to be known. Today data refers to a description of something that allows it to be recorded, analyzed, and reorganized. There is no good term yet for the sorts of transformations produced by Commodore Maury and Professor Koshimizu. So let’s call them datafication. To datafy a phenomenon is to put it in a quantified format so it can be tabulated and analyzed. [1223]

His story highlights the degree to which the use of data predates digitization. [1206: the story of Matthew Fontaine Maury]

It enabled information to be recorded in the form of “categories” that linked accounts. It worked by means of a set of rules about how to record data—one of the earliest examples of standardized recording of information. One accountant could look at another’s books and understand them. It was organized to make a particular type of data query—calculating profits or losses for each account—quick and straightforward. And it provided an audit trail of transactions so that the data was more easily retraceable. Technology geeks can appreciate it today: it had “error correction” built in as a design feature. If one side of the ledger looked amiss, one could check the corresponding entry. [1278: the subject is double-entry bookkeeping]
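The built-in "error correction" the quote describes fits in a few lines. A minimal sketch with made-up account names, assuming each transaction posts one debit and one credit of equal amount:

```python
# Double-entry bookkeeping: every amount is posted twice, so the
# two column totals must agree; a mis-keyed entry breaks the balance.
from collections import defaultdict

ledger = []  # (debit_account, credit_account, amount)

def post(debit, credit, amount):
    ledger.append((debit, credit, amount))

post("inventory", "cash", 120.0)   # buy goods
post("cash", "sales", 200.0)       # sell goods

def trial_balance(ledger):
    debits, credits = defaultdict(float), defaultdict(float)
    for dr, cr, amt in ledger:
        debits[dr] += amt
        credits[cr] += amt
    return sum(debits.values()), sum(credits.values())

total_dr, total_cr = trial_balance(ledger)
print(total_dr == total_cr, total_dr, total_cr)   # True 320.0 320.0
# if one side looks amiss, the corresponding entry names the account to check
```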

Over the following decades the material on bookkeeping was separately published in six languages, and it remained the standard reference on the subject for centuries. [1288: on the chapter about double-entry bookkeeping in Luca Pacioli's mathematics textbook; a pity that the authors write that the Medici were famous merchants and patrons of Venice!]

The standardization of longitude and latitude took a long time. It was finally enshrined in 1884 at the International Meridian Conference in Washington, D.C., where 25 nations chose Greenwich, England, as the prime meridian and zero-point of longitude (with the French, who considered themselves the leaders in international standards, abstaining). [1376]

The company AirSage crunches 15 billion geo-loco records daily from the travels of millions of cellphone subscribers to create real-time traffic reports in over 100 cities across America. Two other geo-loco companies, Sense Networks and Skyhook, can use location data to tell which areas of a city have the most bustling nightlife, or to estimate how many protesters turned up at a demonstration. [1426]

Twitter messages are limited to a sparse 140 characters, but the metadata—that is, the “information about information”—associated with each tweet is rich. It includes 33 discrete items. [1472]

It could tell if someone fell and did not get back up, an important feature for the elderly. [1493]

For well over a century, physicists have suggested that this is the case—that not atoms but information is the basis of all that is. This, admittedly, may sound esoteric. Through datafication, however, in many instances we can now capture and calculate at a much more comprehensive scale the physical and intangible aspects of existence and act on them. [1529]

6. Value. How the value of data changes.
Data’s value shifts from its primary use to its potential future uses. [1565]

[...] data is starting to look like a new resource or factor of production. [1591]

Unlike material things—the food we eat, a candle that burns—data’s value does not diminish when it is used; it can be processed again and again. Information is what economists call a “non-rivalrous” good: one person’s use of it does not impede another’s. And information doesn’t wear out with use the way material goods do. Hence Amazon can use data from past transactions when making recommendations to its customers—and use it repeatedly, not only for the customer who generated the data but for many others as well. [1597]

In the end, the group didn’t detect any increase in the risk of cancer associated with use of mobile phones. For that reason, its findings hardly made a splash in the media when they were published in October 2011 in the British medical journal BMJ. [1709]

Google’s spell-checking system shows that “bad,” “incorrect,” or “defective” data can still be very useful. [1771]

A term of art has emerged to describe the digital trail that people leave in their wake: “data exhaust.” [1785]

[...] “learning from the data” [...] [1789]

“We like learning from large, ‘noisy’ datasets,” chirps one Googler. [1796]

“Data is a platform,” in the words of Tim O’Reilly, a technology publisher and savant of Silicon Valley, since it is a building block for new goods and business models. [1930]

7. Implications. Doing business with data.
In the previous chapter we noted that data is becoming a new source of value in large part because of what we termed its option value, as it’s put to novel purposes. The emphasis was on firms that collect data. Now our regard shifts to the companies that use data, and how they fit into the information value chain. We’ll consider what this means for organizations and for individuals, both in their careers and in their everyday lives.
Three types of big-data companies have cropped up, which can be differentiated by the value they offer. Think of it as the data, the skills, and the ideas.

Hal Varian, Google’s chief economist, famously calls statistician the “sexiest” job around. “If you want to be successful, you want to be complementary and scarce to something that is ubiquitous and cheap,” he says. “Data is so widely available and so strategically important that the scarce thing is the knowledge to extract wisdom from it. That is why statisticians, and database managers and machine learning people, are really going to be in a fantastic position.” [1976]

Mathematics and statistics, perhaps with a sprinkle of programming and network science, will be as foundational to the modern workplace as numeracy was a century ago and literacy before that. [2261]

Rolls-Royce sells the engines but also offers to monitor them, charging customers based on usage time (and repairs or replaces them in case of problems). Services now account for around 70 percent of the civil-aircraft engine division’s annual revenue. [2315]

8. Risks. The dark side.
The dataset, of 20 million search queries from 657,000 users between March 1 and May 31 of that year, had been carefully anonymized. Personal information like user name and IP address were erased and replaced by unique numeric identifiers. The idea was that researchers could link together search queries from the same person, but had no identifying information.
Still, within days, the New York Times cobbled together searches like “60 single men” and “tea for good health” and “landscapers in Lilburn, Ga” to successfully identify user number 4417749 as Thelma Arnold, a 62-year-old widow from Lilburn, Georgia. “My goodness, it’s my whole personal life,” she told the Times reporter when he came knocking. “I had no idea somebody was looking over my shoulder.” [2438]
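The mechanics of that re-identification deserve a toy illustration (invented log lines, not the actual AOL data): replacing the user name with a numeric identifier keeps every person's queries linked together, and the queries themselves then act as quasi-identifiers.

```python
# Pseudonymization leaves queries linkable: group a "de-identified"
# log by its numeric id and a behavioral profile falls out.
import itertools
from collections import defaultdict

raw_log = [
    ("thelma", "60 single men"),
    ("thelma", "landscapers in Lilburn, Ga"),
    ("alice",  "cheap flights"),
    ("thelma", "tea for good health"),
]

ids = defaultdict(itertools.count(4417749).__next__)   # one pseudonym per user
released = [(ids[user], query) for user, query in raw_log]

profile = defaultdict(list)
for uid, query in released:
    profile[uid].append(query)

for uid, queries in profile.items():
    print(uid, queries)
# the name is gone, but the bundle of queries narrows the search to
# one person in one Georgia town, which is how the NYT found Thelma Arnold
```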

“It isn’t the consumers’ job to know what they want,” he famously said, when telling a reporter that Apple did no market research before releasing the iPad. [2654]

“It is true enough that not every conceivable complex human situation can be fully reduced to the lines on a graph, or to percentage points on a chart, or to figures on a balance sheet,” said McNamara in a speech in 1967, as domestic protests were growing. “But all reality can be reasoned about. And not to quantify what can be quantified is only to be content with something less than the full range of reason.” [2662]

9. Control. New rules are needed.
Changes in the way we produce and interact with information lead to changes in the rules we use to govern ourselves, and in the values society needs to protect. [2695]

Rather than a parametric change, the situation calls for a paradigmatic one. [2717]

With such an alternative privacy framework, data users will no longer be legally required to delete personal information once it has served its primary purpose, as most privacy laws currently demand. This is an important change, since, as we’ve seen, only by tapping the latent value of data can latter-day Maurys flourish by wringing the most value out of it for their own—and society’s—benefit. [2746]

Without guilt, there can be no innocence. [2804]

10. Next. The future of big data.
Just a summary of the previous chapters. But it couldn't have been left out.
 
Boris.Limpopo | 29 more reviews | Apr 29, 2019 |
The book starts with the story of a company named Farecast. In 2003, Oren Etzioni of the University of Washington is on an airplane and decides to ask the other passengers how much they paid for their seats. It turns out that one person paid one fare and another person paid another. This made Oren really upset, because he had taken the time to book his ticket long in advance, assuming he would pay the lowest price. He started thinking: if only I knew what lies behind airfares, how would I know whether the price shown on an online travel site is a good one or a bad one? Then came the insight. Being a computer scientist, he realised this is just an information problem. He decided to collect the price record of every single flight in commercial aviation in the United States, for every route, noting how long before departure each ticket was bought and what price was paid. On that basis he built a system whose main goal was predicting whether a price was likely to rise or fall. It worked pretty well. He then kept adding data until he had 20 billion flight-price records fuelling his predictions. The system saved customers a lot of money, and eventually Microsoft came along and bought it for $100 million.

This airfare data has become the raw material for a new business, a new source of value, and a new form of economic activity. Right now many companies (Google, Amazon, Apple, eBay, IBM, etc.) are running a new economy on this data. It is the fuel of the information economy.

Now a few things about more data. We are coming from an environment where we were always information-starved; we never had enough. Today we live in a world where that is no longer the operative constraint. Although we never have all the information, the book gives a few great examples of why more data is better than clean data. Another striking example of the conflict between data quality and quantity relates to Amazon. So when... (if you'd like to read my full review, please visit my blog: https://leadersarereaders.blog/2018/12/13/bigdata/)
 
LeadersAreReaders | 29 more reviews | Feb 19, 2019 |
A good primer for starting off in the Big Data field.
Examples are interesting, although slightly repetitive.
The biggest takeaway for me is to look for correlation rather than causality.

All in all, an easy read with over 30 pages of references for those who want to delve deeper.
 
MickBrooke | 29 more reviews | Jan 2, 2019 |
Can we be punished for something we never committed?
How do smart electricity meters spy on us?
What is the connection between hurricanes and sales of pre-packaged strawberry pastries?
What colour of used car is worth buying?
How can match-fixing in sport be exposed?

The answers lie in the method of big data, that is, the organisation and analysis of enormous masses of data, through which we can reach astonishing conclusions. There are no secrets any more: the data reveals everything about our lives.
In the coming years the big data revolution will change the way we think about business, healthcare, politics, education and innovation, and its use will spread to more and more areas of life. Misused, however, it also carries new dangers: it can threaten privacy rights, and it may even happen that someone is convicted for something they never committed, simply because big data can predict future behaviour.
In this book, built with crystal-clear logic and full of surprising insights, two brilliant experts explain what big data is, how it is changing our lives, and what we can do to protect ourselves from its dangers.
 
bukkonyvtar | 29 more reviews | Oct 10, 2016 |
Excellent overview of data science, data collection and analysis. Includes interesting case studies. Drones on a bit in the last chapter, but you have to end it somehow.
 
ndpmcIntosh | 29 more reviews | Mar 21, 2016 |
A very important book: clear and easy to read, with many examples illustrating real-life big data cases! Highly recommended to both technical and non-technical readers who want to understand the nature of the future we are facing and the power of data that will rule our future lives! Hopefully we can avoid falling under the dictatorship of data.

Some Reading Notes:

* Google does it. Amazon does it. Walmart does it. And, as news reports last week made clear, the United States government does it.
* Google published a paper in Nature claiming that it could predict the spread of flu, having analyzed 50m search terms and run 450m different mathematical models. In 2009, its model was more accurate and faster at predicting the spread than government statistics.
* Oren Etzioni of Farecast took big data files of airline ticket prices relative to days remaining before the flight, so his system was able to calculate the optimum time for buying a ticket. It crunches 200bn flight-price records to make its predictions, saving passengers an average of $50 a flight. Microsoft eventually bought the company for $110m and integrated it into Bing.
* Amazon uses customer data to give us recommendations based on our previous purchases. Google uses our search data and other information it collects to sell ads and to fuel a host of other services and products.
* Why spread such a huge net in search of a handful of terrorist suspects? Why vacuum up data so indiscriminately?
* The new thinking is that people are the sum of their social relationships, online interactions and connections with content. In order to fully investigate an individual, analysts need to look at the widest possible penumbra of data that surrounds the person — not just whom they know, but whom those people know too, and so on.
* Big data analytics are revolutionizing the way we see and process the world, similar to the Gutenberg printing press.
* Data is growing incredibly fast: by one account, it is more than doubling every two years! As storage costs plummet and algorithms improve, data-crunching techniques, once available only to spy agencies, research labs and gigantic companies, are becoming increasingly democratized.
* There has been a dramatic increase in the amount of data. When the Gutenberg press was invented there was a doubling of information stock every 50 years. Information now doubles every three years. Big data analysis has been made possible by three technological advances: increased datafication of things, increased memory storage capacity and increased processing power.
* The advantage of ‘n=all’ is that it shows up correlations that would not appear under normal circumstances.
* Correlation does not equal causation
* The film Moneyball depicts how big data outperforms human instinct. Rather than relying on ‘intuition’ and ‘experience’ (which is often unconsciously influenced by other irrelevant factors – like in this case a player’s girlfriend or his swing), it relies on amassing a lot of clean data that has been uninfluenced by emotions and prejudice.
* Big data is often messy and incomplete (only 5% of all data is structured). But the sheer scale of the data compensates for this lack of precision, e.g. in a vineyard, measuring the temperature with just one sensor once a day will be much less accurate than 100 sensors taking readings every minute (see the sketch after this list).
* Big data will increasingly be used as the primary default mechanism for many decisions, as it increases accuracy and reduces irrelevant influences. Erik Brynjolfsson at MIT's Sloan School found that companies that excelled at data-driven decision making had productivity 6% higher than those that do not emphasize empirical judgment.
* Albert-László Barabási analyzed the mobile phone calls of one-fifth of a country's population over a four-month period. He discovered that people with a lot of links were not the ultimate connectors of groups; it was those on the outside of groups (who connect between different groups) who were the key to information transfer across a network.
* Datafication is the unearthing of data from seemingly undatafiable sources. The reality is that these days almost anything can be datafied, from pressure points across a retail floor through to measuring sleep patterns via our mobile phones. In 2009 Apple was granted a patent for collecting blood oxygenation levels, heart rate and body temperature from the earbuds connected to its iPhones. Likewise, GreenGoose has developed tiny movement sensors that can be put onto a pack of dental floss, providing real behavioural data as opposed to claimed usage.
* We are also seeing the datafication of people and their relationships. Facebook's 'likes' have datafied sentiment, and the rich data on all the personal interconnections provides a great source of analysis: Facebook's user base of 1bn represents over 10% of the entire world's population, and no other database has as much information about people and their interconnections. Whilst Facebook has been very cautious about exploiting this data, the information that lies within it will help us understand people, their relationships and societies.
* One of the biggest opportunities is the use of data for secondary purposes.
* There appear to be four ways of polishing the data diamond: Reuse (i.e. allowing other people to use your data); Merging (i.e. cross-comparing a number of different data sets, as sometimes the real value of data is best released when combined with other data); Combining (i.e. building bigger data platforms from a number of different sources); and Twofers (i.e. the primary owner of the data finds a secondary use for the data).
* There are three groups who are at the heart of the development of big data: The data owners; the data analysts (who convert it into usable information) and finally the big data entrepreneurs (who spot new uses that other people are blind to).
* The essential point about big data is that a change of scale leads to a change in state. It is now possible even to judge (and punish) people before the act has taken place.
* The 2012 Obama campaign used sophisticated data analysis to build a formidable political machine for identifying supporters and getting out the vote
* New York City has used data analytics to find new efficiencies in everything from disaster response, to identifying stores selling bootleg cigarettes, to steering overburdened housing inspectors directly to buildings most in need of their attention
* Dark Side of Big Data : “The ability to capture personal data is often built deep into the tools we use every day, from Web sites to smartphone apps,” the authors write. And given the myriad ways data can be reused, repurposed and sold to other companies, it’s often impossible for users to give informed consent to “innovative secondary uses” that haven’t even been imagined when the data was first collected.
* Dark Side of Big Data: As in the movie Minority Report, predictions seem so accurate that people can be arrested for crimes before they are committed. In the near future, Big Data may bring about a situation “in which judgments of culpability are based on individualized predictions of future behavior.”
* Big Data will employ “predictive policing,” crunching data “to select what streets, groups and individuals to subject to extra scrutiny, simply because an algorithm pointed to them as more likely to commit crime.”
* probabilities can negate “the very idea of the presumption of innocence.”
* Too Much Data! When the Sloan Digital Sky Survey began in 2000, there was more data in its first few weeks than had been collected over the entire history of astronomy. By 2010 it had amassed 140TB of information. In 2016 a new telescope will come on stream which will acquire the same amount of data in just 5 days. Likewise, the CERN particle physics laboratory in Switzerland collects less than 0.1% of the information that is generated during its experiments.
* When scientists first decoded the human genome in 2003, it took a decade of work to sequence 3bn base pairs. Now a single facility can sequence the same amount in just one day.
* On the stock market, 7m stocks are traded each day. Google produces more than 24 petabytes of data a day. Facebook gets 10m new photos uploaded every hour. There were 400m tweets a day in 2012.
* Big data is not the magic elixir for everything. Robert McNamara (US secretary of defense during the Vietnam war) believed he could tame the complexity of war via analysis of big data – but as history has shown, such a strategy was flawed.
* A danger of Big Data is relying on the numbers when they are far more fallible than we think. The authors point to the escalation of the Vietnam War under Robert S. McNamara (who served as secretary of defense to Presidents John F. Kennedy and Lyndon B. Johnson) as a case study in “data analysis gone awry”: a fierce advocate of statistical analysis, McNamara relied on metrics like the body count to measure the progress of the war, even though it became clear that Vietnam was more a war of wills than of territory or numbers.
* Book References: "Who Owns the Future?" , "The Signal and the Noise"
* The danger of falling under the dictatorship of data
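The vineyard bullet above lends itself to a quick simulation (invented temperatures and noise levels, not from the book): one precise reading per day cannot estimate the daily mean, while many cheap, noisy readings average their own error away.

```python
# One clean reading a day vs. 100 noisy sensors every minute.
import numpy as np

rng = np.random.default_rng(2)
minutes = np.arange(24 * 60)
# assumed true temperature curve: coolest before dawn, warmest mid-afternoon
truth = 15 + 8 * np.sin(2 * np.pi * (minutes - 9 * 60) / (24 * 60))

single = truth[12 * 60]            # one exact reading, taken at noon
noisy = truth[None, :] + rng.normal(0, 2.0, size=(100, minutes.size))
crowd = noisy.mean()               # average of 144,000 imprecise readings

print(f"true daily mean   {truth.mean():5.2f}")
print(f"one clean reading {single:5.2f}")   # off: hostage to the time of day
print(f"many noisy ones   {crowd:5.2f}")    # error ~ 2/sqrt(100*1440) degrees
```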
 
eknowledger | 29 more reviews | Feb 28, 2016 |
A very good non-technical analysis of Big Data and how this new wave is transforming not only businesses but also personal lives. One interesting insight: in the Big Data world, 2+2 need not be 4; even 3.8 or 3.9 is good enough. The amount of data being processed is so vast that it is next to impossible to fathom the exact mechanics of how an airline site is consistently able to give fares that are 5 to 10% less than the competition's. The very fact that it does is good enough.
 
danoomistmatiste | 29 more reviews | Jan 24, 2016 |
A very good non-technical analysis of Big Data and how this new wave is transforming not only businesses but also personal lives. One interesting insight: in the Big Data world, 2+2 need not be 4; even 3.8 or 3.9 is good enough. The amount of data being processed is so vast that it is next to impossible to fathom the exact mechanics of how an airline site is consistently able to give fares that are 5 to 10% less than the competition's. The very fact that it does is good enough.
 
kkhambadkone | 29 more reviews | Jan 17, 2016 |
This book is a mixed bag. The information contained therein is fantastic, but the way it's laid out is not. Interestingly, the well-written summary could have replaced much of the awfully repetition-laden, boring exposition. I took delight in the concrete examples of actionable data analysis the book offered; it was those nuggets I was looking for while sifting through the sand. It's a concise book that could have been even more concise, because the information gleaned could easily have been pared down, by an excellent editor, to a long article. Not a regrettable read, however. This stuff is the wave of the future.
1 vote
MartinBodek | 29 more reviews | Jun 11, 2015 |
These authors argue that in the era of Big Data, the idea of privacy is obsolete. We click, we search, we call, we charge it, and computers can process all that data faster than we imagine. While it takes CDC doctors weeks to verify the outbreak of a flu epidemic, Google can detect the increase in the number of flu-related Internet searches almost immediately! The ability to categorize and even identify people within an immensity of anonymous data will jolt the reader. The authors' enthusiasm comes through in the writing, whether you are a businessman hoping to use Big Data or a citizen concerned about the changing digital landscape.
 
ktoonen | 29 more reviews | May 29, 2015 |
Ignore the Italian subtitle: the book is more balanced in its discussion of what Big Data is and of its potential impacts. Not technical, but it shares enough cases to discuss some minutiae that are often forgotten.

Now a couple of years old, it is still relevant for most of its content, and worth reading as it extends beyond mere Big Data and also embraces mobile devices, the "Internet of Things", and privacy and policy issues.

Actually, it could be recommended reading for both politicians and (non-technical) senior management within the private sector, to help them see beyond the self-appointed proponents of yet another "management silver bullet".
 
aleph123 | 29 more reviews | May 1, 2015 |
An awesome book indeed. This is an excellent summary of how big data affects us, and therefore of what shape the future could take! All the examples and Big Data use cases are very practical and will definitely help people without prior knowledge of the subject get the real picture. A great analysis; this book should be read by anybody who wants to understand the Digital Age and beyond!

March 1st, 2015
 
Fouad_Bendris | 29 more reviews | Mar 1, 2015 |
I thought this book was a good introduction to an increasingly important topic. It is a readable book, and if you dig under the surface there is a helpful model for approaching the implications of digital memory. There is a good balance between the issues, a conceptual framework, and the pros and cons of some possible solutions. Only slightly let down by examples that were too extreme or simplistic.
 
culturion | Dec 2, 2014 |
Big data, one of recent years' new buzzwords, has now gotten itself a book with said title. Mayer-Schönberger and Cukier's "Big data: a revolution that will transform how we live, work and think" focuses mostly on what businesses can do with big data, and you ain't gonna find much material here as a technology-oriented data scientist. The book is from 2013 and already seems dated in the light of the Snowden revelations. The authors' critique of personal big data collection does not mention the dragnet operations of signals intelligence agencies beyond an 8-line William Binney paragraph.

The authors claim three features of big data ("three major shifts of mindset"): "more", messy, and correlation rather than causality. I am not entirely convinced that these features distinguish big data. Interventional A/B testing seems at least to some degree to probe causality rather than just correlation. Such tests are continuously run at large scale by major Internet companies on unsuspecting users. Thus I would say big data processing is indeed probing causality. Nor do I agree that big data is messier than old-time small data. Anyone working seriously with small data may easily find that handling such data can be a considerable headache, requiring some processing and 'understanding'. Indeed, big data technologies have brought us means for handling messy data in a more structured way (JSON, NoSQL, Semantic Web, Wikidata). The reason small data may feel less messy could be that the clean-up of small data can be done manually in a spreadsheet by a non-programmer, while for big data you need automatic tools and probably a programmer.
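The A/B-testing point above lends itself to a small worked example (invented conversion counts, not from the review or the book): because users are randomized into variants, a significant difference in conversion rate supports a causal reading, which is exactly the objection to "correlation only".

```python
# Two-proportion z-test for a randomized A/B experiment.
from math import sqrt

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    pa, pb = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)            # pooled rate under H0
    se = sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (pb - pa) / se

# variant A: current page; variant B: the intervention
z = two_proportion_z(conv_a=480, n_a=10_000, conv_b=560, n_b=10_000)
print(f"z = {z:.2f}")   # about 2.55, beyond 1.96: significant at the 5% level
```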

The authors also claim that we will see the rise of a profession called the 'algorithmist', whose job it will be to review algorithms. I do not think this is likely. The closest we will probably get is Google's advisory board on the 'right to be forgotten'.

Nor do the authors give us a proper critique of the big data hype. Their initial example, Google Flu Trends, is dated: a publication from March 2014 shows a wrong flu prevalence estimation from Google Flu Trends (see 'The Parable of Google Flu: Traps in Big Data Analysis'). The Zeo EEG big data device mentioned in the book, hailed back in 2013 as one of the "8 Best Sleep Tracking Apps and Devices", has run out of money, is 'out of business', and you won't get a response from www.myzeo.com.

While the authors tell us that companies collect vast amounts of data and that "companies may be powerful", they assure us on page 156 that companies "don't have the state's powers to coerce". Well, yes. But states have the ability to coerce a company into handing over any personal data. Indeed, U.S. companies are coerced into handing over overseas data: Loretta A. Preska of the United States District Court told Microsoft as much. And within the U.S. PRISM program the handover is determined in secret FISA courts.

But in general there is a good all-round discussion of the issues of Big Data, e.g. the notion of "collect everything" without a necessarily predefined purpose. It will be interesting to hear the authors' opinions in light of the Snowden revelations.
 
fnielsen | 29 more reviews | Aug 5, 2014 |
The concept of Big Data refers to the huge quantity of data about every aspect of our lives that we have at our disposal today, thanks above all to new technologies and the internet. Together with the steadily falling cost of storage and the growth in processing power, this opens up the possibility of much broader analyses, which can bring unexpected correlations to the surface.

The main idea the book explains is very interesting. Unfortunately, like so many technical authors, it repeats the same information over and over, just phrased differently. A book that could have been a very powerful article becomes heavy and redundant. Combined with a somewhat preachy style, this makes the ending truly unbearable. A pity...
 
arnautc | 29 more reviews | Apr 22, 2014 |
The authors explore the role of data, lots of it, in our world. From Google to financial institutions, data is being collected by the terabyte. Instead of looking at causation, data analysis finds correlations to predict actions. An interesting read, though a bit repetitive in its examples and content. Somehow I feel the authors missed key points, or rather didn't accent them in a way that made the reader understand their significance.
1 vote
addunn3 | 29 more reviews | Jan 2, 2014 |
This book gives us the views of two leading Big Data experts. They shed light on the reality of this phenomenon and its consequences, as well as on the good and bad sides of using Big Data.
 
Rihet | 29 more reviews | Dec 4, 2013 |