LocLok: Protecting Location Privacy in Smartphones
Concerns about location privacy arise frequently with the rapid development of GPS-enabled devices and location-based applications. However, there is still no official privacy-preserving cell phone app that protects users' location privacy while enabling GPS-based services. In academia, the popular method, called spatial cloaking, is to replace a latitude/longitude position with a randomly generated area. For more details, please go to http://forum.loclok.com.
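To make the idea of spatial cloaking concrete, here is a minimal sketch in Python. It is only an illustration, not the LocLok implementation: the function name cloak, the rectangle shape, and the sample coordinates and sizes are all assumptions. The true point is placed at a uniformly random offset inside the rectangle, so that the center of the released area does not give the point away.

```python
import random

def cloak(lat, lon, width, height, rng):
    """Return a rectangle (south, west, north, east) that contains (lat, lon).

    width is the east-west extent and height the north-south extent, both
    in degrees. The rectangle is shifted by a uniformly random offset so
    the true point is not always at its center.
    """
    south = lat - rng.uniform(0.0, height)
    west = lon - rng.uniform(0.0, width)
    return south, west, south + height, west + width

# Hypothetical position and area size, chosen only for illustration.
area = cloak(33.7925, -84.3240, width=0.02, height=0.01, rng=random.Random(42))
print(area)
```

A service would then receive the rectangle instead of the exact position. Note that plain cloaking like this carries no formal privacy guarantee by itself.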
Click here to view a demo system.
Our technique can rigorously protect location privacy even when the attacker is acquainted with the user, for example when the attacker knows exactly the user's moving habits and historically visited places. Even then, our technique provably protects the current location under state-of-the-art differential privacy. Furthermore, we prove that our technique is the optimal solution satisfying the differential privacy guarantee.
I show an example as follows. Figure 1 shows a real trajectory on a real map, where the two axes represent longitude and latitude. The user traveled for 500 timestamps. The corresponding grid map of the same trajectory is shown in Figure 2. Figure 3 shows the trajectory released by the existing method, while Figure 4 shows the trajectory released by our method. Our method clearly provides more utility, which is also verified in Figure 5, where the distances between the true locations and the released locations are plotted. LM is the existing Laplace Mechanism; PIM is our Planar Isotropic Mechanism. The formal proof and experiment settings can be found in our paper.
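For readers who want to see what the LM baseline roughly looks like, here is a minimal Python sketch of a Laplace mechanism applied to a trajectory, together with the distance-based error used as the utility measure above. It is a sketch under assumptions, not the code from the paper: the synthetic trajectory, the epsilon values, and the sensitivity of 1.0 (an assumed L1 sensitivity of the location query) are all made up for illustration, and PIM itself is not shown.

```python
import math
import random

def laplace_noise(scale, rng):
    """Draw one zero-mean Laplace sample with the given scale.

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def release_location(x, y, epsilon, sensitivity, rng):
    """Perturb each coordinate independently with Laplace noise."""
    scale = sensitivity / epsilon
    return x + laplace_noise(scale, rng), y + laplace_noise(scale, rng)

def mean_error(trajectory, epsilon, sensitivity, seed=0):
    """Average Euclidean distance between true and released locations."""
    rng = random.Random(seed)
    total = 0.0
    for x, y in trajectory:
        rx, ry = release_location(x, y, epsilon, sensitivity, rng)
        total += math.hypot(rx - x, ry - y)
    return total / len(trajectory)

# Hypothetical trajectory of 500 timestamps on a grid (units: grid cells).
trajectory = [(t * 0.1, t * 0.05) for t in range(500)]
for epsilon in (0.5, 1.0, 2.0):
    print(epsilon, mean_error(trajectory, epsilon, sensitivity=1.0))
```

A smaller epsilon means stronger privacy and a larger average distance between the true and released locations, which is the trade-off that Figure 5 compares across mechanisms.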
DPCube: Releasing Differentially Private Data Cubes for Health Information
As we all know, health data is highly private. For example, a patient's disease should never be exposed to anyone without the patient's consent. On the other hand, health data is very useful for improving clinical treatment and for medical research institutes. The question, then, is how to release health data so that the sensitive part is protected while the useful part is released. More generally, if we have a database containing both private and useful information, how can we use such data without breaching privacy?
DPCube is a practical and rigorous method to tackle this. It satisfies the state-of-the-art differential privacy guarantee for privacy preservation while still providing useful information about the data. The main components of DPCube are described as follows.
We also have Matlab code that implements DPCube, which is available upon request. The following figure shows the interface of a data release scenario.
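As a rough illustration of the kind of release involved, here is a minimal Python sketch of a differentially private histogram: count the records in each cell of a data cube and add Laplace noise to every count. This is only the standard baseline building block under assumed names and toy data, not the DPCube algorithm or its Matlab code; the partitioning that DPCube performs on top of such counts is not shown.

```python
import random
from collections import Counter

def laplace_noise(scale, rng):
    """Zero-mean Laplace sample, as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_histogram(records, cells, epsilon, rng):
    """Count records per cell and add Laplace(1/epsilon) noise to each count.

    Adding or removing one record changes exactly one count by 1, so the
    L1 sensitivity of the whole histogram is 1.
    """
    counts = Counter(records)
    scale = 1.0 / epsilon
    return {cell: counts.get(cell, 0) + laplace_noise(scale, rng) for cell in cells}

# Hypothetical records (age group, diagnosis), for illustration only.
records = [("30-39", "flu"), ("30-39", "flu"), ("40-49", "diabetes")]
cells = [(age, dx) for age in ("30-39", "40-49") for dx in ("flu", "diabetes")]
released = noisy_histogram(records, cells, epsilon=1.0, rng=random.Random(0))
for cell in cells:
    print(cell, round(released[cell], 2))
```

The released counts can be negative or fractional; whether and how to post-process them is a separate design choice.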
Adaptive Differentially Private Data Release
I dig further into the theory of differential privacy and investigate a general method for releasing data with differential privacy. This method is still under development. Click here for an introduction to this project, which is supported by NSF.
An Adversarial Machine Learning Model
ML models are known for their vulnerability to adversarial attacks. Even state-of-the-art ML models can be easily fooled by adding a small amount of noise.
In the following example, we show that a cat (on the left) from Tiny ImageNet can be misclassified as other objects after adding small noise that humans cannot even perceive. Although the three images on the right look similar to the original cat, they are classified as sea cucumber, desk and dragonfly. Note that the ML model is ResNet18, one of the best vision models as of 2018, with about 76% accuracy.
It takes about 60 to 90 seconds to generate one adversarial sample. Our code uses the Carlini algorithm and is available upon request. Also note that the perturbation under the attack can be less than 1.0 pixel (on the 0~255 image scale).
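To give a feel for why such a small perturbation can be enough, here is a toy Python sketch. It is not the Carlini algorithm and not ResNet18: it uses a made-up linear classifier on a 4096-value "image", for which the smallest L2 perturbation that crosses the decision boundary has a closed form. All weights, the bias, and the pixel values are assumptions chosen for illustration, and clipping to the valid pixel range and rounding to integers are ignored.

```python
def score(w, b, x):
    """Linear decision score w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Class 1 if the score is positive, otherwise class 0."""
    return 1 if score(w, b, x) > 0 else 0

def minimal_perturbation(w, b, x, overshoot=1e-3):
    """Smallest L2 change that moves x just across the boundary w.x + b = 0.

    The change points along w; `overshoot` pushes slightly past the
    boundary so that the predicted class actually flips.
    """
    step = -(1.0 + overshoot) * score(w, b, x) / sum(wi * wi for wi in w)
    return [step * wi for wi in w]

# Hypothetical classifier and input on the 0~255 pixel scale.
n = 4096
w = [0.01 if i % 2 == 0 else -0.01 for i in range(n)]
b = -200.0
x = [100.0 if i % 2 == 0 else 90.0 for i in range(n)]

delta = minimal_perturbation(w, b, x)
x_adv = [xi + di for xi, di in zip(x, delta)]
print(predict(w, b, x), predict(w, b, x_adv))
print(max(abs(d) for d in delta))
```

Because the change is spread over thousands of pixels, each pixel moves by well under 1.0 while the predicted class still flips. Attacks on deep networks exploit a similar effect, but need iterative optimization because the decision boundary is not linear.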
A P2P Video Sharing System
Many of us watch online videos every day, on YouTube or other video providers. However, the technique behind video sharing is not hard. Here I show a simple video sharing system with the original code.
My system has three major parts: a tracker, a super peer, and many peers. The tracker is a MySQL database service that contains all the information about the available videos. If a user, which is a peer, wants to browse the videos in our system, it sends a query request to the tracker, which returns all the channels to the peer. Each channel has a super peer, which holds the detailed information of the channel. When a user selects a channel, the peer opens a TCP connection to the super peer. The super peer accepts the TCP connection and adds the user to the audience's neighbor list.
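The roles described above can be sketched in a few lines of Python. This is an in-memory illustration only, not the original code: the MySQL tracker is replaced by a dictionary, the TCP connection by a direct method call, and every class, method, and address name here is an assumption.

```python
class SuperPeer:
    """Holds the details of one channel and the peers watching it."""

    def __init__(self, channel, address):
        self.channel = channel
        self.address = address
        self.audience = []  # ids of peers currently connected

    def accept(self, peer_id):
        """Accept a connection and return the current audience list."""
        if peer_id not in self.audience:
            self.audience.append(peer_id)
        return list(self.audience)

class Tracker:
    """Stands in for the database that lists the available channels."""

    def __init__(self):
        self._channels = {}

    def register(self, super_peer):
        self._channels[super_peer.channel] = super_peer

    def list_channels(self):
        return sorted(self._channels)

    def lookup(self, channel):
        return self._channels[channel]

class Peer:
    """A viewer that browses channels and joins one of them."""

    def __init__(self, peer_id, tracker):
        self.peer_id = peer_id
        self.tracker = tracker
        self.neighbors = []

    def browse(self):
        return self.tracker.list_channels()

    def join(self, channel):
        super_peer = self.tracker.lookup(channel)
        audience = super_peer.accept(self.peer_id)
        self.neighbors = [p for p in audience if p != self.peer_id]
        return self.neighbors

tracker = Tracker()
tracker.register(SuperPeer("news", "10.0.0.1:9000"))    # hypothetical address
tracker.register(SuperPeer("sports", "10.0.0.2:9000"))  # hypothetical address
alice, bob = Peer("alice", tracker), Peer("bob", tracker)
print(alice.browse())
alice.join("news")
print(bob.join("news"))
```

Keeping the tracker limited to channel lookup means the per-channel state lives with the super peer, so the tracker stays small even with many viewers.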
Over the TCP connection, the video content is transferred in units of data packets. As shown in the following figure, a peer has a data buffer area to hold these packets.
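One possible shape for such a buffer is sketched below in Python. It is an assumption about how a peer might organize its packets, not the original design: packets are assumed to carry a sequence number, the buffer is assumed to be a fixed-size window, and the names PacketBuffer, put, pop_ready, and missing are hypothetical.

```python
class PacketBuffer:
    """Fixed-size window of packets, indexed by sequence number."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.next_seq = 0   # next sequence number the player needs
        self._slots = {}    # sequence number -> payload

    def put(self, seq, payload):
        """Store a packet; reject it if already played or outside the window."""
        if seq < self.next_seq or seq >= self.next_seq + self.capacity:
            return False
        self._slots[seq] = payload
        return True

    def pop_ready(self):
        """Remove and return the payloads that are contiguous from next_seq."""
        ready = []
        while self.next_seq in self._slots:
            ready.append(self._slots.pop(self.next_seq))
            self.next_seq += 1
        return ready

    def missing(self):
        """Sequence numbers inside the window that have not arrived yet."""
        window = range(self.next_seq, self.next_seq + self.capacity)
        return [seq for seq in window if seq not in self._slots]

buf = PacketBuffer(capacity=4)
buf.put(1, b"second")       # arrives out of order, so nothing is playable yet
print(buf.pop_ready())
buf.put(0, b"first")
print(buf.pop_ready())
print(buf.missing())
```

The list returned by missing() is what a peer could request from its neighbors, which is where the neighbor list kept by the super peer becomes useful.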
The classes used in this program are summarized as follows.
Deep Learning Certificate
I obtained the Deep Learning certificate from deeplearning.ai.
Machine Learning Certificate
I obtained the Machine Learning certificate from Stanford University.
Yonghui (Yohu) Xiao
Hi, I have been an engineer at Google since May 2017. Before joining Google, I was a Ph.D. student in the CS department at Emory University. Prior to Emory, I received 3 bachelor's degrees from Xi'an Jiaotong University in 2005. After graduation, I worked at IVO in China. In 2008, I became a graduate student in the CS department at Tsinghua University, where I was lucky to join the collaborative Tsinghua-Emory research project. After 3 years of research work, I felt that I should focus on this promising area and follow my passion for solving real-world problems. In 2011, with the kind help of Li, I became a Ph.D. student at Emory University. I interned at Samsung Research America (SRA) in summer 2014.
Click here to download my CV.
My research mainly focuses on data privacy protection. For example, do you worry that your location data may be stolen while you use Yelp to find the nearest restaurants, or that your browsing history may be exposed while you search on Google? While concerns about data privacy arise frequently, privacy preservation techniques still lag behind. The catch is that you have to provide your information to get the service. For example, you must give your location to Yelp in order to get the nearest restaurants. So how can we balance privacy and utility? While the silver lining of this problem, called differential privacy, is still under debate in academia, practical privacy-preserving techniques still have a long way to go.
I am happy to work in this area, which has enormous opportunities and so many possibilities. I believe doing is much more important than talking, which is why I am also developing an iPhone app ("LocLok") with the state-of-the-art technique in this area (^_^).
Program Committee: student PC of IEEE SP (IEEE Symposium on Security and Privacy) 2016, 2017; paper reviewer of TDSC (IEEE Transactions on Dependable and Secure Computing) 2017; paper reviewer of TKDE (IEEE Transactions on Knowledge and Data Engineering) 2016; paper reviewer of TIFS (IEEE Transactions on Information Forensics and Security) 2016; paper reviewer of TOPS (ACM Transactions on Privacy and Security) 2016; paper reviewer of TMC (IEEE Transactions on Mobile Computing) 2016.
External reviewer for various conferences including KDD (the ACM Conference on Knowledge Discovery and Data Mining) 2010, SIGMOD (the ACM Conference on Management of Data) 2011 demo, VLDB (the International Conference on Very Large Data Bases) 2012, CIKM (the ACM Conference on Information and Knowledge Management) 2013 demo, ICDE (International Conference on Data Engineering) 2014, CCS (the ACM Conference on Computer and Communications Security) 2014, CIKM PSDB 2014, TDSC (IEEE Transactions on Dependable and Secure Computing) 2015, TKDE (IEEE Transactions on Knowledge and Data Engineering) 2015, CIKM 2016.
I learned a new skill from SIGMOD 2016: blind-reject. As a victim of this devastating technique, I would like to share it with you (click on it). If I were a PC member, I would never blind-reject a paper.
I love math, especially vector spaces and matrix computations. I am fascinated by black hole theory, string theory and the 11-dimensional universe (or maybe multiverse). I like to play guitar, but I am not very good at it. I like to jog in the gym, which helps me calm down and relax.
- Yonghui Xiao, Li Xiong. Methods and systems for determining protected location information based on temporal correlations. US Patent 9,867,041
- Yang Cao, Shun Takagi, Yonghui Xiao, Li Xiong, Masatoshi Yoshikawa. PANDA: Policy-aware Location Privacy for Epidemic Surveillance. 46th International Conference on Very Large Data Bases (VLDB) demo, 2020
- Yang Cao, Yonghui Xiao, Li Xiong, Liquan Bai and Masatoshi Yoshikawa. Protecting Spatiotemporal Event Privacy in Continuous Location-Based Services. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2019
- Yang Cao, Yonghui Xiao, Li Xiong, Liquan Bai, Masatoshi Yoshikawa. PriSTE: Protecting Spatiotemporal Event Privacy in Continuous Location-Based Services. 45th International Conference on Very Large Data Bases (VLDB) demo, 2019
- Yang Cao, Yonghui Xiao, Li Xiong, Liquan Bai. PriSTE: From Location Privacy to Spatiotemporal Event Privacy. International Conference on Data Engineering (ICDE) poster, 2019
- Yang Cao, Li Xiong, Masatoshi Yoshikawa, Yonghui Xiao, Si Zhang. ConTPL: Controlling Temporal Privacy Leakage in Differentially Private Continuous Data Release. Proceedings of the VLDB Endowment (PVLDB), demo, 2018
- Yang Cao, Masatoshi Yoshikawa, Yonghui Xiao and Li Xiong. Quantifying Differential Privacy in Continuous Data Release under Temporal Correlations. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2018
- Yonghui Xiao, Li Xiong, Si Zhang, Yang Cao. LocLok: Location Cloaking with Differential Privacy via Hidden Markov Model. 43rd International Conference on Very Large Data Bases (VLDB) demo, 2017
- Yang Cao, Masatoshi Yoshikawa, Yonghui Xiao, Li Xiong. Quantifying Differential Privacy under Temporal Correlations. IEEE International Conference on Data Engineering (ICDE), 2017
- Xiaofeng Xu, Li Xiong, Vaidy Sunderam, Yonghui Xiao. A Markov Chain Based Pruning Method for Predictive Range Queries. ACM SIGSPATIAL, 2016
- Yonghui Xiao, Li Xiong. Protecting Locations with Differential Privacy under Temporal Correlations. 22nd ACM Conference on Computer and Communications Security (CCS), 2015
- Yonghui Xiao, Li Xiong, Liyue Fan, Slawomir Goryczka, Haoran Li. DPCube: Differentially Private Histogram Release through Multidimensional Partitioning. Transactions on Data Privacy (TDP), 7:3, 195-222, 2014
- James Gardner, Li Xiong, Yonghui Xiao, Jingjing Gao, Andrew Post, Xiaoqian Jiang, Lucila Ohno-Machado. SHARE: System Design and Case Studies for Statistical Health Information Release. Journal of the American Medical Informatics Association (JAMIA), 20(1), 2013
- Xiao Y, Gardner J, Xiong L. DPCube: Releasing Differentially Private Data Cubes for Health Information. International Conference on Data Engineering (ICDE) demo, 2012
- Xiao Y, Xiong L, Yuan C. Differentially Private Data Release through Multidimensional Partitioning. 7th VLDB Workshop on Secure Data Management, Singapore, Sep 17, 2010
Amazon graduate research symposium, 2017
IEEE S&P student PC travel award, 2017
NSF I-Corps award as entrepreneur lead, 2016
CCS travel award, 2015.
NSF ICDE scholarship, 2012.
Ph.D. Fellowship, Laney Graduate School of Emory University, 2011-Present
Scholarship of Foxconn at Tsinghua University, Beijing, China 2009.
Feel free to contact me if you have any questions (^_^).
Please find my contact information below.
To Contact me
Email: yhandxiao AT gmail dot com
Office: Mountain View office of Google
Phone: 404-772-0x1c2d9401 where the last four digits were encrypted by RSA. You are welcome to call me if you can crack those last four digits.