Privacy Risks of Fake Detection
RUFake was expressly designed to mitigate a broad range of privacy risks, putting us ahead of our competitors.
- Risks of correctly determining that a photo is ‘fake’. This has legal, financial, social and psychological implications for you, for the data owner (photo provider), and for RUFake.
- Risks of falsely determining that a photo is ‘fake’ when it is in fact real. The photo owner may not wish others to know that they shared it with you, or may be upset that somebody else shared it with you, for instance if they were photographed in a sensitive scenario.
- Risks of biased analysis. This may include algorithms that preferentially label photos as fake when analyzing certain groups or individuals, such as celebrities, or certain types of people, such as women or men. RUFake prevents this risk through rigorous selection of datasets and AI modeling approaches.
- Risk of disclosure or loss of user information. There are several such risks, including identity disclosure, attribute disclosure and membership disclosure. What if a photo you upload is sensitive, for example taken in a restricted jurisdiction, or shows a person who was not authorized to be in that location? RUFake mitigates this risk by cropping background information from your photo, and by not storing data.
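As an illustrative sketch only (not RUFake's actual pipeline), cropping background information amounts to keeping only the pixels inside a subject bounding box and discarding everything else. The box here is a hypothetical input; in practice it would come from an upstream subject detector.

```python
def crop_to_region(pixels, box):
    """Return only the pixels inside `box`, discarding all background.

    pixels: image as a 2D list of rows (any pixel representation)
    box:    (top, left, bottom, right), half-open bounds
    """
    top, left, bottom, right = box
    return [row[left:right] for row in pixels[top:bottom]]

# Example: a 4x4 "image" whose pixels record their own coordinates.
image = [[(r, c) for c in range(4)] for r in range(4)]

# Hypothetical subject bounding box, e.g. from a face detector.
subject = crop_to_region(image, (1, 1, 3, 3))
# Only the cropped copy is analyzed; the full image can then be dropped.
```

Because the crop returns a new list, the surrounding background pixels are never retained once the original image is discarded.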
Our Privacy Goals
In designing RUFake we formally assessed risk across several factors.
- Prevent data leakage. This is ensured by storing no information.
- Reduce the risks of aggregation, as outlined for specific data types in our recent paper [1]. This covers potential secondary uses of photo-authenticity determinations for legal or business purposes.
- How long should data security last? This is open-ended, since user information, raw data, authenticity determinations and other intermediate analyses have undefined storage durations. Our policy is to store no information.
- A plan to enable data review, editing and deletion. Our policy is to store no information, so there is nothing to review, edit or delete.
- A mitigation plan for inaccurate authenticity determination. This is important given the lack of objective ground truths for many use cases. One plan is to augment training by aggregating more data. Another is to augment training data with synthetic, calibrated modifications, so that algorithm accuracy can be assessed against known ground truths.
- Disclaimers for inaccurate authenticity determination, including the limitations of training data, of supported languages, and of dataset sizes. Our disclaimers include confidence intervals to convey this uncertainty.
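One way such confidence intervals could be computed (a sketch under assumptions, not RUFake's documented method) is a Wilson score interval over the detector's accuracy on a validation set; the counts below are made-up example values.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a proportion,
    e.g. a fake-detector's accuracy on n validation photos.
    z=1.96 gives an approximate 95% interval."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - margin, centre + margin)

# Hypothetical: 92 correct determinations out of 100 validation photos.
lo, hi = wilson_interval(92, 100)
```

The Wilson interval is preferred over the simpler normal approximation because it stays inside [0, 1] and behaves sensibly for small samples or accuracies near 100%.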
References
A comprehensive literature exists on online privacy concerns and solutions. A selected reference by our group, cited on this page, is:
- [1] S. M. Narayan, N. Kohli and M. M. Martin, “Addressing contemporary threats in anonymised healthcare data using privacy engineering,” npj Digital Medicine, vol. 8, no. 1, p. 145, 2025. DOI: 10.1038/s41746-025-01520-6