Big Data

A Data Driven Future

There is a growing awareness of the power of data to be a lever with which humanity can change the world for the better. This is serious stuff. Save lives. Heal our planet. Generate immense real economic value and improve the quality of life of nearly every living human being. The application of data science to the rising tide of data inundating our lives is an important part of achieving these benefits for humanity.

All segments of all industries must rapidly adapt to a data-driven economy. People and companies that bury their heads in the sand will only fall behind with greater velocity over time. Technological change is not only happening faster; its rate of change is increasing, and with it the magnitude of its impact, positive and negative, in ever shorter time cycles. Feel like things are changing faster? They are changing faster.

When it comes to data and the science we want to do with that data, there is no one pill to cure all our ills. There is no easy button, even though that is what everyone mistakenly wants. To learn how to use data to change the world, we need to understand its core properties.

There are four things to know about data that stand out as especially important.

Data is an Asset. It is like real estate, a bond, cash in the bank or an automotive production plant. When you begin to consider data in this way, it changes how you choose to manage it over time. It changes how you assign value to data.

Data is Digital. Data is unique in many ways because it is digital. This gives it, generally speaking, a low marginal cost of replication and reuse. This can be a blessing or a curse depending on how this little detail is managed and used as a lever to generate value from data.

Data is Strategically Valuable. As a digital asset, data is strategically valuable to the ongoing concern of any entity, be it a person or an organization. This just seems obvious and, happily, it’s becoming more so now that the tools to use the data we can generate are becoming more broadly available.

Data is Dynamic. Data will never stop changing. Even if you just leave it alone, it'll rot or age. Some data has a very short half-life of value. Other data, like a fine wine, just gets better with age (and integration). If your data management systems do not account for this, you'll end up with a data cesspool instead of a pristine and beautiful data lake.
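To make the "half-life of value" idea a little more concrete, here is a minimal sketch in Python. It assumes value decays exponentially, which is borrowed from the half-life metaphor and is only one possible model. The function name and the numbers are hypothetical, chosen purely for illustration.

```python
def remaining_value(initial_value: float, age_days: float, half_life_days: float) -> float:
    """Estimate what a piece of data is still worth after age_days.

    Assumption: value halves every half_life_days (simple exponential decay).
    Real data may lose value in steps, or even gain value as it ages.
    """
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return initial_value * 0.5 ** (age_days / half_life_days)


# Hypothetical example: a trending-topic signal worth 100 units today,
# with an assumed half-life of 7 days.
for age in (0, 7, 14):
    print(age, remaining_value(100.0, age, 7.0))
```

A data management system that tracked even a rough number like this per data set would have a basis for deciding what to refresh, archive, or delete.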

Understand these four key properties of data. Consider the implications of five Billion people online, connected and communicating. Remember that data can be used to change the world. Think. Then, Do Good Things With Data for People.

There will be between 50 and 200 Billion devices connected to the internet by around 2020. At the high end, that’s as many as 25 devices for every person that will be alive at that time. Today, that number is closer to three for every living person. This is currently called the Internet of Things. People are a crucial component of these things. People don’t like being called things and that is understandable. Like it or not though, people are part of the IoT.
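The devices-per-person figure is simple division, and it is worth seeing how sensitive it is to the inputs. The sketch below assumes a world population of roughly 7.8 billion around 2020. That population figure is my assumption for illustration, while the 50 and 200 billion device counts are the range quoted above.

```python
# Assumption: roughly 7.8 billion people alive around 2020.
ASSUMED_POPULATION_2020 = 7.8e9

# Device range quoted in the text above.
LOW_DEVICES = 50e9
HIGH_DEVICES = 200e9


def devices_per_person(devices: float, population: float) -> float:
    """Average number of connected devices per living person."""
    return devices / population


low = devices_per_person(LOW_DEVICES, ASSUMED_POPULATION_2020)
high = devices_per_person(HIGH_DEVICES, ASSUMED_POPULATION_2020)
print(f"{low:.1f} to {high:.1f} devices per person")
```

Under that assumption the range works out to roughly 6 to 25 devices per person, which is where the high-end figure of about 25 comes from.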

Over half of humanity is not online yet! But, they will be very soon. Right now there are about 3.15 Billion people online in various ways. That will be over five Billion in the next three to five years.

Access to massive data sets and humanity's ability to use them effectively are astounding. Algorithms are being created that learn to do things only humans could do before, and in some cases they learn them in very human-like ways. Today, they are still usually carefully trained and parented by a loving data scientist.

The subjects touched on lightly in this post are beginnings.

---

About the Author
Kent Langley is the CEO/CTO of Ekho, Inc. and Faculty in Data Science at Singularity University. Kent advises companies and frequently acts as a Chief Technical Advisor to business and technology executives, providing technology audits, due diligence, and technical architecture, and filling leadership roles on demand. Kent is also an ExO expert helping companies adopt new people, processes and technologies that enable them to leverage resources effectively and grow.

About Ekho
Ekho is a company that endeavors to deliver on the Massive Transformative Purpose (MTP) to do good things with data for people. Ultimately, Ekho helps its clients derive actionable insights from their data using the best data science tools and processes available.

About Singularity University
Singularity University provides educational programs, innovative partnerships and a startup accelerator to help individuals, businesses, institutions, investors, NGOs and governments understand cutting-edge technologies, and how to utilize these technologies to positively impact billions of people.

Moving at the Speed of Cloud

The majority of my work in the last three years or so has been all about receiving, getting, pushing, pulling, and generally wrangling streams of data (mostly social data) for the purposes of analytics, comparison, or saving across a broad range of products and services for startups (one of my own) and Fortune 500 companies. It's been keeping me busy. All of this serves the ultimate goal of helping businesses make better-informed decisions about products, services, and more.

During this time my colleagues and I have developed the relationships, partnerships, technology stacks, and processes necessary to deliver these types of applications very quickly and at a high level of quality. All in all this has been fun, and demand for it seems to be growing quickly.

To give a sense of the technology "stack" I've mostly settled on for solving these types of problems, we are using:

Languages: Scala, Java, Node.js, PHP, Ruby

Frameworks: Symfony2, Play 2.0, Express.js, Twitter Bootstrap

Data Store: MySQL, MongoDB, Riak, Redis

Infrastructure: Amazon Web Services

Orchestration: Chef, Custom Scripting, AWS CloudFormation

That's just a high-level snapshot, of course. There are a lot of details down inside each of those items, from favored libraries to DB clients and configuration management frameworks.

The best part for me is that, for the first time in a long time, many businesses seem to understand and believe in the value of applying technology to solving business problems as a first-order task.

The drive for big data aggregation and analytics is a natural evolution of the maturation of cloud computing as both a technology and a service/process. Programming languages, application frameworks, and even the general understanding of distributed service-oriented architectures and how to program REST APIs are all improving at such an incredible rate that it's just an awesome time to be creating software.

So much of what we are doing now has been "around" in one form or another for a long time. The science in computer science laid the foundations quite some time ago. It's only now that so much is becoming so accessible and the information on how to use all these tools is readily available.

I read a recent article/survey posted to Forbes.com that said the cloud is still three years away from its full impact. The first CloudCamp, where I did a session on developing for the cloud, was in 2008. That's only four years ago and look how much has changed! Awesome.

From where I sit, this is an exciting time with nearly unlimited possibilities. Ideas are critical. Execution is just as important. If you want to talk about any of these things, I'm usually found either in San Francisco or San Rafael, so let's chat! Good times!!