Editorial note: The HPC Impact Showcase highlights real-world applications of high performance computing (HPC) at companies that are using HPC to advance their competitiveness and innovation in the global marketplace. Rather than technical deep dives into how these companies use or manage their HPC environments, the stories are meant to tell how they are adopting and embracing HPC and how it is improving their businesses.
Platforms like Facebook and LinkedIn use social graphs to help us find personal connections. At Ancestry, however, we were able to build a family history graph that reveals the complicated connections among billions of people, locations, and tens of thousands of historical events.
Through the power of machine learning, Ancestry, with the world's largest family history and consumer DNA database, is creating its Big Tree, a knowledge graph that stitches together 10 petabytes of structured and unstructured data from 18 billion records, more than 80 million family trees, and eight billion people. Built on the scalability of cloud computing, artificial intelligence, big data technology, AI-based search technology, and distributed stitching engines, the Big Tree is updated in real time as users input data or make edits, at a rate of 400 changes per second (roughly 35 million changes a day), becoming more powerful and unearthing new knowledge of familial connections with every update.
In this session, data scientists at Ancestry will explain how the company is leveraging its big data and machine learning capabilities to stitch together the largest family history graph, and how that graph reveals the organic relationships between people, locations, and events, which could prove a powerful force for greater empathy and understanding as we see how we relate to the rest of humanity.
About the Speakers:
Atanu Roy, Ph.D., is a Senior Data Scientist at Ancestry involved in developing the “big” genealogical tree that is formed by attaching the common ancestors of millions of customers. This process involves a pair-matching algorithm that addresses what is known in machine learning parlance as the entity resolution problem. Prior to Ancestry, Atanu worked on social network analysis for Riot’s League of Legends and as a programmer analyst at Cognizant Technology Solutions, where he extracted, transformed, and cleaned large (multi-TB) transactional databases for U.S.-based clients. Atanu received his Ph.D. from the Department of Computer Science & Engineering at the University of Minnesota – Twin Cities.
As part of his doctoral work at UMN, he was actively involved in first-hand research on machine learning techniques and their applications in fields such as social network and big data analytics. He has designed, implemented, and tested a range of scalable machine learning, data mining, and information retrieval models. Atanu’s research interests span data mining and machine learning and their applications in social network analytics and big data analytics. His work has been published in several academic research papers accepted at well-known conferences and journals such as ACM TOIT and the Journal of Advertising.
Tyler Folkman has been a data scientist at Ancestry since May 2016. Before that, Tyler received his Master’s in Computer Science from the University of Texas at Austin, where he studied statistical machine learning, data mining, and AI. He has also worked on statistical modeling and real-time visualization at Amazon, Cognitive Scale, and Charles River Associates. Tyler likes to spend his time exploring novel data sets and posting what he learns on www.learningwithdata.com.