Future Unicorn #228: OneSchema
The Quild Future Unicorn is a weekly product-focused note highlighting one early-stage startup with statistically significant signals of becoming a unicorn.
OneSchema is an embeddable spreadsheet importer and validator. Product and engineering teams use OneSchema to avoid the costly and complicated process of building and maintaining spreadsheet import. Designed for businesses of all sizes, OneSchema empowers product and engineering teams to launch beautiful, performant, fully customized spreadsheet importers in hours, not months.
Top company alumni
Andrew was an engineer at Affinity (3+ yrs)
Christina was a product manager at Google (4 yrs)
General Catalyst led seed
Sequoia Capital participated in seed
Box Group participated in seed
Top university alumni
Christina graduated from Stanford University (BS)
Andrew graduated from Stanford University (BS)
Embeddable spreadsheet importer
Pain point and persona
Underlying every SaaS product are data models that support its basic functions, from user registration to doing the doings it was built for (a Digitally Baffled reference). For example, a CRM needs data fields such as account name, creation and close dates, assigned account manager, and many more. An HRIS needs employee name, date of birth, SSN, bank account number, and many more. Developers build the data models and their relationships to each other - like the chart below for Salesforce's Sales Cloud.
As organizations adopt new products, they need to migrate their current data to fit the new software's data model. This has traditionally been done manually. But as software becomes more self-serve, software companies have increasingly built self-serve data importers for users. Spreadsheets in the form of CSV (comma-separated values) and XLSX (Excel) files are the most common formats that business users deal with. The challenge then becomes validating that the imported data is formatted and mapped correctly. Building a scalable and easy-to-maintain importer (data models change as a product matures) takes months of developer time.
OneSchema is an embeddable CSV (for now) importer that developers can plug into their product easily. There are four main components to OneSchema's product: parser, data validation, error handling, and mapping.
Parser: processes structured data (e.g. CSV), interprets its structure, and extracts the individual data elements it contains into a spreadsheet. Basically, it "reads" the CSV or Excel file.
Data validation: checks each cell in the spreadsheet using rules. It determines how data should be cleaned. Validations are applied to data to ensure the data meets specific requirements (e.g. business logic) - imagine interchanging European and US date formats. OneSchema has 30+ pre-built validators from dates to country codes to SSN. Developers can also build custom validators.
Error handling: After error detection comes error fixing. Some errors are easy to fix, like capitalization. Some are more difficult, like interpolating missing values - should you use mean, median, mode, percentiles? The user experience is important as well. OneSchema lets users apply fixes in bulk and navigate and filter errors. Sometimes, select data points just don't make sense and the end user needs to check with colleagues to troubleshoot. OneSchema has a neat quality-of-life feature that allows exporting the data to an Excel file with errors highlighted and annotated. In large data files, this is a lifesaver. Imagine having to annotate hundreds to thousands of cells by hand.
Mapping: Mapping the data is linking the spreadsheet columns to the data model fields. It is a relatively straightforward feature. OneSchema has another neat quality-of-life feature that intelligently suggests mappings.
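To make the parser, validation, and mapping components concrete, here is a minimal Python sketch of a toy importer. It is not OneSchema's API or implementation. The two-field schema, the validator functions, the sample CSV, and the fuzzy name matching used to suggest mappings are all assumptions made up for illustration, using only the Python standard library.

```python
import csv
import io
from datetime import datetime
from difflib import get_close_matches


def validate_required(value):
    """Return an error message if the cell is empty, else None."""
    return None if value.strip() else "value is required"


def validate_us_date(value):
    """Return an error message unless the cell is a MM/DD/YYYY date."""
    try:
        datetime.strptime(value.strip(), "%m/%d/%Y")
        return None
    except ValueError:
        return "expected a US date (MM/DD/YYYY)"


# Hypothetical target data model: field name -> validator.
SCHEMA = {
    "account_name": validate_required,
    "close_date": validate_us_date,
}


def parse(text):
    """Parser: read CSV text into a header row and a list of data rows."""
    rows = list(csv.reader(io.StringIO(text)))
    return rows[0], rows[1:]


def normalize(name):
    """Lowercase a column header and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


def suggest_mapping(headers, fields):
    """Mapping: suggest a data model field for each column by fuzzy name match."""
    mapping = {}
    for header in headers:
        matches = get_close_matches(normalize(header), fields, n=1, cutoff=0.6)
        mapping[header] = matches[0] if matches else None
    return mapping


def validate(headers, rows, mapping):
    """Validation: check every mapped cell; collect (row, column, message)."""
    errors = []
    # Row 1 of the file is the header, so data rows start at 2.
    for row_number, row in enumerate(rows, start=2):
        for header, value in zip(headers, row):
            field = mapping.get(header)
            if field is None:
                continue  # unmapped columns are skipped
            message = SCHEMA[field](value)
            if message:
                errors.append((row_number, header, message))
    return errors


# Sample file: the second data row has a blank name and a European-style date.
SAMPLE = "Account Name,Close Date\nAcme,12/31/2022\n,31/12/2022\n"

headers, rows = parse(SAMPLE)
mapping = suggest_mapping(headers, list(SCHEMA))
errors = validate(headers, rows, mapping)

for row_number, column, message in errors:
    print(f"row {row_number}, column {column!r}: {message}")
```

Even this toy version shows why the real thing takes months: each new field needs a validator, every file format needs its own parser, and the error list still has to be turned into something an end user can navigate and fix.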
If you have more time, here's a 3-min product demo.
What I'm trying to understand this coming holiday break: the implications of Geoffrey Hinton's forward-forward algorithm, presented at NeurIPS. If you've thought about it or read it, I would love to pick your brain over coffee/meal/eggnog.