# Differentially Private Alternating Minimization

This code accompanies the papers *Private Alternating Least Squares* (DPALS) and *Multi-Task Differential Privacy Under Distribution Skew*.

## Algorithms

This library provides algorithms for training personalized models with differentially private alternating minimization (DPAM).

DP personalized models are of the form

$$f_i(x) = \langle u_i, v(x) \rangle$$

where $u_i$ is a user embedding (specific to each user $i$) and $v(x)$ is an item encoder shared among all users. (Notice that each user $i$ has a different prediction function $f_i$.) Applications include recommendation systems, where the encoder $v(x)$ learns a shared representation of the items to recommend, the user embedding $u_i$ encodes user-specific preferences, and the dot product $\langle u_i, v(x) \rangle$ captures the relevance of item $x$ to user $i$. These models are also known as multi-encoder models and play an important role in federated learning.
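To make this structure concrete, here is a minimal sketch in Python (toy shapes and a random linear encoder; all names are hypothetical and not the library's API):

```python
import numpy as np

rng = np.random.default_rng(0)
num_users, d, num_features = 100, 16, 32   # hypothetical sizes

U = rng.normal(size=(num_users, d))        # per-user embeddings u_i
W = rng.normal(size=(d, num_features))     # shared encoder weights (toy, linear)

def v(x):
    """Shared item encoder v(x); a linear map here, for illustration only."""
    return W @ x

def f(i, x):
    """Personalized prediction f_i(x) = <u_i, v(x)>."""
    return U[i] @ v(x)

x = rng.normal(size=num_features)          # feature vector of one item
print(f(3, x))                             # relevance of item x to user 3
```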

DP alternating minimization alternates between:

1. solving for the user embeddings $u_i$ using (exact) least squares, and
2. training the shared encoder $v(x)$ with DP.

* An important special case is matrix completion, where the encoder $v(x)$ is linear. In this case, step 2 is solved using DP least squares (see Algorithm 1 in DPALS, and the sketch below).
* For general encoders, step 2 is solved using DPSGD.
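Below is a toy sketch of the matrix-completion case (hypothetical names throughout; the Gaussian noise on the item-side sufficient statistics is a crude stand-in for the calibrated mechanism of Algorithm 1 in DPALS):

```python
import numpy as np

def dpam_matrix_completion(ratings, num_users, num_items,
                           d=8, steps=10, noise_scale=1.0, reg=0.1):
    """Toy DP alternating minimization for matrix completion.

    ratings: iterable of (user, item, value) triples.
    """
    rng = np.random.default_rng(0)
    U = rng.normal(size=(num_users, d))
    V = rng.normal(size=(num_items, d))
    by_user, by_item = {}, {}
    for u, i, r in ratings:
        by_user.setdefault(u, []).append((i, r))
        by_item.setdefault(i, []).append((u, r))

    for _ in range(steps):
        # Step 1: exact least squares for each user embedding u_i.
        # No noise is added here: u_i depends only on user i's own data
        # and is never released, so it needs no DP protection.
        for u, obs in by_user.items():
            idx = [i for i, _ in obs]
            vals = np.array([r for _, r in obs])
            A = V[idx].T @ V[idx] + reg * np.eye(d)
            U[u] = np.linalg.solve(A, V[idx].T @ vals)

        # Step 2: DP least squares for each row of the shared item matrix,
        # adding Gaussian noise to the sufficient statistics (a crude
        # stand-in for the calibrated mechanism in DPALS).
        for i, obs in by_item.items():
            idx = [u for u, _ in obs]
            vals = np.array([r for _, r in obs])
            A = U[idx].T @ U[idx] + reg * np.eye(d)
            b = U[idx].T @ vals
            N = rng.normal(size=(d, d))
            A = A + noise_scale * (N + N.T) / 2   # symmetric noise
            b = b + noise_scale * rng.normal(size=d)
            V[i] = np.linalg.solve(A, b)

    return U, V
```

The asymmetry is the key design point: only the shared parameters are released (and hence noised), while each $u_i$ stays on the user side and can be solved exactly.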

## Distribution Skew and Adaptive Budget Allocation

In practice, there is typically a long tail of items with far fewer training examples (this is especially common in recommendation problems). Private models are known to be particularly susceptible to this type of distribution skew.

The paper Multi-Task Differential Privacy Under Distribution Skew proposes a method to handle this skew by adaptively allocating each user's privacy budget among the different items (see Algorithms 1 and 2 in the paper). At a high level, allocating more privacy budget to tail items improves overall model quality.

The library provides an implementation of this adaptive budget allocation.
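As a rough illustration of the idea (a toy weighting scheme, not the paper's Algorithms 1 and 2): split each user's total budget across items so that rare items receive a larger share, while keeping the user's overall contribution norm bounded.

```python
import numpy as np

def adaptive_item_weights(item_counts, total_budget=1.0, power=0.5):
    """Toy per-item budget allocation that favors tail items.

    Weights decay as item_counts**(-power), then are rescaled so the
    squared weights sum to total_budget, i.e. a user contributing once
    to every item still has bounded L2 sensitivity. Illustrative only;
    the paper's Algorithms 1 and 2 derive the allocation differently.
    """
    counts = np.maximum(np.asarray(item_counts, dtype=float), 1.0)
    raw = counts ** (-power)
    return raw * np.sqrt(total_budget / np.sum(raw ** 2))

counts = np.array([10000, 500, 20, 3])   # head-to-tail item frequencies
print(adaptive_item_weights(counts))     # tail items receive larger weights
```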

## Colab notebook

The notebook dpam.ipynb provides examples of training non-private and private models on the MovieLens benchmarks used in the two papers, along with tuned hyper-parameters at different $\epsilon$ levels.
