Bangalore, 22nd May 2023: Uber’s India engineering team organized the second edition of its flagship data event, DataCon 2023, a full-day conference dedicated to data engineering, cutting-edge data platforms, and effective data management for better business decisions.

The day-long technology conference saw participation from members of academia, including IIT Bombay; technology leaders and subject experts from Uber; leading global technology companies such as Google and Meta; and innovative start-ups such as Browsee. They came together to foster knowledge exchange and collaboration on enhancing the quality and reliability of data at scale.

In today’s business landscape, large companies across various industries generate vast amounts of data, and Uber is no exception. DataCon 2023 exemplified Uber’s unwavering commitment to data innovation and its contribution to the growth of data engineering within the broader tech community. As an organization that prioritizes data, Uber recognizes the profound impact data has on key metrics, compliance adherence, feature performance, and issue identification. Through substantial investments in resilient infrastructure, comprehensive frameworks, advanced tools, and refined data models, Uber efficiently manages data and extracts valuable insights.

Speaking on the power of data to run the business, Manikandan Thangarathnam, Senior Director and Site-Lead for Uber Engineering, Bangalore, said, “As a mobile app and platform based service, data plays a crucial role in developing, managing and innovating Uber’s offerings. Continuous analysis of data is critical to optimize on our app performance and quality delivery of our service. Our engineers leverage data effectively to provide optimal experience for all users of our platform such as the Riders, Eaters, Drivers, and Delivery partners. For example, we utilize data to empower Drivers and Delivery partners to increase their earnings while ensuring Riders and Eaters receive accurate estimated time of arrival (ETA) calculations down to the last second.”

Vijay Mann, Director of Data Platforms at Uber, emphasized the significance of data by saying, “Data serves as a fundamental driver of our operations, fueling our pursuit of excellence. At Uber, we have built a resilient technological infrastructure that empowers us to harness the power of data in making impactful business decisions on a large scale. We consistently navigate challenges, from optimizing Kafka operations and storage to transitioning from Hive to Spark. DataCon ’23 presents a significant opportunity for us to engage with industry peers, exchange insights, and glean valuable learnings from their experiences.”

During DataCon 2023, a series of pivotal discussions and sessions took place, encompassing the following topics:

Unleashing Insights from Extensive Data Volumes using DataSketches: Uber showcased the power of DataSketches as a robust tool for real-time processing and analysis of large data volumes. By providing accurate approximations of data aggregates, DataSketches reduce memory utilization and computational cost.

Data Lineage for Spark Applications: This session delved into the criticality of data lineage in tracing the path of data flow within the system. Data lineage makes it possible to understand the origins and destinations of data in Data Warehouse tables, facilitating comprehensive insights.

The Role of Data in Uber’s Monumental Redesign Initiatives: This session shed light on how data plays a pivotal role in validating system accuracy, resolving customer issues, overseeing deployments, and complying with regulatory obligations. It specifically addressed significant projects like Project One Earner, which involve large-scale fundamental re-architecture.

Data Processing and Compliance Regulatory Reporting for Supply and Geofence Levels: This two-part session explored the remodeling of Uber’s Supply hour data and compliance regulatory reporting for geofence-level data. It also emphasized using Supply hour data to meet compliance objectives effectively.

Empowering the Notebook Experience at Uber: This session highlighted Uber’s Data Science Workbench (DSW) product, an all-encompassing platform for data science. The DSW offers an intuitive web application with fully configured and managed RStudio Server and Jupyter Notebooks, enabling seamless collaboration and efficient data analysis.

The external speakers at Uber DataCon 2023 addressed some of the business challenges they are solving through data, highlighting the latest advancements in technology. A Senior Staff Engineer at Google spoke about how the company is leveraging Cloud Data Fusion, a fully managed, cloud-native, enterprise data integration service for quickly building and managing data pipelines. A senior engineer from Meta discussed the evolution of Scribe, Meta’s internal distributed buffered queuing system, which helps the organisation transport petabytes of data per hour, and how it has faced scaling challenges and undergone incremental re-architecture to remain efficient.