Texera
Collaborative Machine-Learning-Centric Data Analytics Using Workflows
Goals
- Provide data science as cloud services;
- Provide a browser-based GUI to form a workflow without writing code;
- Allow non-IT people to access data science;
- Support collaborative data science;
- Allow users to interact with the execution of a job;
- Support huge volumes of data efficiently.
Workflow GUI
The Texera interface supports real-time collaboration on data science projects, allowing seamless sharing of data and workflows with easy access to AI/ML techniques and efficient management of public and private resources.
The workflow in the use case shown below includes data cleaning, ML model training, and validation.
Publications (Computer Science)
- (5/2025) Responsive Retrieval of Consistent States in Pipelined Executions of Dataflows
Shengquan Ni and Chen Li
To appear in HILDA Workshop at SIGMOD 2025
- (11/2024) IcedTea: Efficient and Responsive Time-Travel Debugging in Dataflow Systems
Shengquan Ni, Yicong Huang, Zuozhi Wang, and Chen Li
To appear in VLDB 2025
- (8/2024) Pasta: A Cost-Based Optimizer for Generating Pipelining Schedules for Dataflow DAGs
Xiaozhen Liu, Yicong Huang, Xinyuan Lin, Avinash Kumar, Sadeem Alsudais, and Chen Li
To appear in SIGMOD 2025
- (7/2024) Texera: A System for Collaborative and Interactive Data Analytics Using Workflows
Zuozhi Wang, Yicong Huang, Shengquan Ni, Avinash Kumar, Sadeem Alsudais, Xiaozhen Liu, Xinyuan Lin, Yunyan Ding, and Chen Li
In VLDB 2024, Scalable Data Science track | PDF | Slides
- (3/2024) Demonstration of Udon: Line-by-line Debugging of User-Defined Functions in Data Workflows
Yicong Huang, Zuozhi Wang, and Chen Li
In SIGMOD 2024 Best Demo Runner-Up Award🏆 | PDF
- (2/2024) Data Science Tasks Implemented with Scripts versus GUI-Based Workflows: The Good, the Bad, and the Ugly
Alexander K Taylor, Yicong Huang, Junheng Hao, Xinyuan Lin, Xiusi Chen, Wei Wang, and Chen Li
In DataPlat Workshop at ICDE 2024 | PDF | Slides
- (8/2023) Building a Collaborative Data Analytics System: Opportunities and Challenges
Zuozhi Wang and Chen Li
In Tutorial at VLDB 2023 | PDF | Slides
- (8/2023) Udon: Efficient Debugging of User-Defined Functions in Big Data Systems with Line-by-Line Control
Yicong Huang, Zuozhi Wang, and Chen Li
In SIGMOD 2024 | PDF | Slides
- (8/2023) Improving Iterative Analytics in GUI-Based Data-Processing Systems with Visualization, Version Control, and Result Reuse
Sadeem Alsudais
Ph.D. Thesis | PDF
- (7/2023) Using Texera to Characterize Climate Change Discussions on Twitter During Wildfires
Shengquan Ni, Yicong Huang, Jessie W. Y. Ko, Alexander Taylor, Xiusi Chen, Avinash Kumar, Sadeem Alsudais, Zuozhi Wang, Xiaozhen Liu, Wei Wang, Suellen Hopfer, and Chen Li
In Data Science Day at KDD 2023
- (7/2023) Raven: Accelerating Execution of Iterative Data Analytics by Reusing Results of Previous Equivalent Versions
Sadeem Alsudais, Avinash Kumar, and Chen Li
In HILDA Workshop at SIGMOD 2023 | PDF
- (6/2023) Texera: A System for Collaborative and Interactive Data Analytics Using Workflows
Zuozhi Wang
Ph.D. Thesis | PDF
- (12/2022) Towards Interactive, Adaptive and Result-aware Big Data Analytics
Avinash Kumar
Ph.D. Thesis | PDF
- (9/2022) Fries: Fast and Consistent Runtime Reconfiguration in Dataflow Systems with Transactional Guarantees
Zuozhi Wang, Shengquan Ni, Avinash Kumar, and Chen Li
In VLDB 2023 | PDF | Slides
- (7/2022) Drove: Tracking Execution Results of Workflows on Large Datasets
Sadeem Alsudais
In the Ph.D. Workshop at VLDB 2022 | PDF
- (6/2022) Demonstration of Accelerating Machine Learning Inference Queries with Correlative Proxy Models
Zhihui Yang, Yicong Huang, Zuozhi Wang, Feng Gao, Yao Lu, Chen Li, and X. Sean Wang
In VLDB 2022 | PDF
- (6/2022) Demonstration of Collaborative and Interactive Workflow-Based Data Analytics in Texera
Xiaozhen Liu, Zuozhi Wang, Shengquan Ni, Sadeem Alsudais, Yicong Huang, Avinash Kumar, and Chen Li
In VLDB 2022 | PDF | Demo Video
- (4/2022) Optimizing Machine Learning Inference Queries with Correlative Proxy Models
Zhihui Yang, Zuozhi Wang, Yicong Huang, Yao Lu, Chen Li, and X. Sean Wang
In VLDB 2022 | PDF
- (7/2020) Demonstration of Interactive Runtime Debugging of Distributed Dataflows in Texera
Zuozhi Wang, Avinash Kumar, Shengquan Ni, and Chen Li
In VLDB 2020 | PDF | Video | Slides
- (1/2020) Amber: A Debuggable Dataflow system based on the Actor Model
Avinash Kumar, Zuozhi Wang, Shengquan Ni, and Chen Li
In VLDB 2020 | PDF | Video | Slides
- (4/2017) A Demonstration of TextDB: Declarative and Scalable Text Analytics on Large Data Sets
Zuozhi Wang, Flavio Bayer, Seungjin Lee, Kishore Narendran, Xuxi Pan, Qing Tang, Jimmy Wang, and Chen Li
In ICDE 2017 Best Demo award | PDF | Video
Publications (Interdisciplinary)
- (2/2025) DS4ALL: Teaching High-School Students Data Science and AI/ML Using the Texera Workflow Platform as a Service
Jiadong Bai, Xiaozhen Liu, Anthony Cuturrufo, Alexander Kundu Taylor, Jeehyun Hwang, Mingyu Derek Ma, Xinyuan Lin, Yanqiao Zhu, Yicong Huang, Yunyan Ding, Wei Wang, and Chen Li
To appear in Data Science Education K-12: Research to Practice Annual Conference 2025
- (7/2024) Brain Image Data Processing Using Collaborative Data Workflows on Texera
Yunyan Ding, Yicong Huang, Pan Gao, Andy Thai, Atchuth Naveen Chilaparasetti, M. Gopi, Xiangmin Xu, and Chen Li
In Frontiers Neural Circuits | PDF
- (1/2024) Wording Matters: The Effect of Linguistic Characteristics and Political Ideology on Resharing of COVID-19 Vaccine Tweets
Judith Borghouts, Yicong Huang, Suellen Hopfer, Chen Li, and Gloria Mark
In TOCHI 2024 | PDF
- (1/2024) How the Experience of California Wildfires Shape Twitter Climate Change Framings
Jessie W. Y. Ko, Shengquan Ni, Alexander Taylor, Xiusi Chen, Yicong Huang, Avinash Kumar, Sadeem Alsudais, Zuozhi Wang, Xiaozhen Liu, Wei Wang, Chen Li, and Suellen Hopfer
In Climatic Change 2024 | PDF
- (11/2023) The Marketing and Perceptions of Non-Tobacco Blunt Wraps on Twitter
Joshua U. Rhee, Yicong Huang, Aurash J. Soroosh, Sadeem Alsudais, Shengquan Ni, Avinash Kumar, Jacob Paredes, Chen Li, and David S. Timberlake
In Substance Use & Misuse 2023 | [PDF](https://www.
