Date: 2021-07-31



The MetroLab Network has partnered with Government Technology to bring readers a segment called the MetroLab Innovation of the Month series, which highlights impactful technology, data and innovation projects underway between cities and universities. If you would like to learn more or get in touch with the project leader, please contact MetroLab for more information.

This month’s installment of the Innovation of the Month series focuses on EquiTensors, a project that examines and raises awareness of the applications, opportunities, and potential misuse of data science and AI in mobility and transportation, with an emphasis on fairness and diversity. MetroLab’s Josh Schacht spoke with Project Leader Bill Howe, Associate Professor of Informatics at the University of Washington, Associate Professor of Computer Science and Engineering, and Associate Director and Senior Data Science Fellow of the UW eScience Institute.

Josh Schacht: Please tell us about the origin of this project and the big picture issues it is working on.

Bill Howe: I started the EquiTensors project after observing that most open data was really underutilized. There are many reasons for this: the reliability, source, and structure of the data can all make it difficult to use. So we started thinking about what people usually want to do with the data, and more and more, people want to train some kind of predictive model.

But cities are complex systems, so everything interacts with everything else. Demand for waste management services can depend on traffic volume, which in turn can depend on income distribution as well as weather. A travel planner app may need to work with hundreds of datasets to train a good model.

So I started thinking of these thousands of datasets as different windows onto the same underlying dynamics of the city. What if we could capture those dynamics more directly? If so, anyone could build a prediction application without having to find, download and process hundreds of interrelated datasets. Instead, they could use learned features, which we call EquiTensors.

Schacht: How does this project improve existing forecasting methods?

Howe: Companies and institutions that use EquiTensors can reduce the effort of downloading and processing multiple datasets, include signals from datasets that would otherwise be inaccessible, and guard against making unreasonable predictions. They can reduce training time and improve accuracy. In addition, which datasets turn out to be predictive can be surprising. EquiTensors’ kitchen-sink approach removes the need to make that choice.

Schacht: Please tell us how this project focuses on fairness and privacy.

Howe: An important requirement was to combat discrimination. All city data reflects decades of historical discrimination. For example, racist redlining practices from 100 years ago still influence today’s home prices and racial demographics. We cannot allow predictive models to propagate these signals. That’s why we employ fair machine learning techniques to reduce unwanted correlation between sensitive attributes and other attributes. As a result, the learned features can better represent data from the world we want, rather than the world we have.
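To make the idea of reducing correlation between sensitive and other attributes concrete, here is a minimal sketch in Python. It is not the EquiTensors method, which the interview does not describe in detail. It swaps in a much simpler stand-in, linear residualization, and the function names and toy numbers are hypothetical.

```python
from statistics import mean


def covariance_sum(xs, ys):
    """Sum of products of deviations from the mean (unnormalized covariance)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    return covariance_sum(xs, ys) / (
        covariance_sum(xs, xs) * covariance_sum(ys, ys)
    ) ** 0.5


def decorrelate(feature, sensitive):
    """Subtract the part of `feature` that is linearly predictable from `sensitive`.

    Assumption: only linear dependence is removed; nonlinear dependence can remain.
    """
    slope = covariance_sum(feature, sensitive) / covariance_sum(sensitive, sensitive)
    centre = mean(sensitive)
    return [f - slope * (s - centre) for f, s in zip(feature, sensitive)]


# Hypothetical toy data: a binary sensitive attribute and a feature that tracks it.
sensitive = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
feature = [1.0, 2.0, 1.5, 3.0, 4.0, 3.5]

adjusted = decorrelate(feature, sensitive)
before = pearson(feature, sensitive)   # strongly correlated
after = pearson(adjusted, sensitive)   # linear correlation removed
```

The adjusted feature keeps its overall mean and its variation within each group, but it no longer carries a linear signal about the sensitive attribute. A learned representation would need a stronger technique to handle nonlinear dependence as well.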

While the current iteration does not provide a strong privacy guarantee, the single point of control that EquiTensors provides offers an opportunity to share the signal in private data safely without exposing the raw data. This means that agencies and businesses can train EquiTensors on private data and expose only the learned features, rather than personally identifiable information. For a stronger guarantee, differential privacy schemes can be adapted to this setting (at the expense of utility).
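The trade-off between differential privacy and utility mentioned above can be illustrated with a small Python sketch. It uses the standard Laplace mechanism on a bounded mean, chosen here only for illustration; the interview does not say which scheme would be adapted to EquiTensors. The function names and the trip-duration data are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of `values` with epsilon-differential privacy.

    Values are clipped to [lower, upper], so replacing one record changes
    the mean by at most (upper - lower) / n. The record count n is assumed public.
    """
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / len(clipped)
    true_mean = sum(clipped) / len(clipped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(values, lower, upper, epsilon, trials=2000, seed=0):
    """Average absolute error of the private mean over repeated releases."""
    rng = random.Random(seed)
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    errors = [
        abs(private_mean(values, lower, upper, epsilon, rng) - true_mean)
        for _ in range(trials)
    ]
    return sum(errors) / trials


# Hypothetical data: trip durations in minutes, assumed to lie in [0, 60].
trips = [12.0, 35.0, 8.0, 50.0, 22.0, 41.0, 17.0, 29.0]

strict_error = mean_abs_error(trips, 0.0, 60.0, epsilon=0.1)   # strong privacy
loose_error = mean_abs_error(trips, 0.0, 60.0, epsilon=10.0)   # weak privacy
```

A smaller epsilon gives a stronger guarantee but adds more noise, so the strict setting produces a much larger average error than the loose one. That gap is the utility cost.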

Schacht: What are the next steps for this project? Where do you see it going from here?

Howe: The next steps include using deep learning techniques to interpolate data that is missing in space and time, researching transfer learning so that EquiTensors trained in Seattle can be applied in Chicago, synthesizing data in new contexts, and investigating the relationship between explainability and fairness.

