\PassOptionsToPackage{unicode=true}{hyperref} % options for packages loaded elsewhere
\PassOptionsToPackage{hyphens}{url}
%
\documentclass[
]{article}
\usepackage{lmodern}
\usepackage{amssymb,amsmath}
\usepackage{ifxetex,ifluatex}
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{textcomp} % provides euro and other symbols
\else % if luatex or xelatex
\usepackage{unicode-math}
\defaultfontfeatures{Scale=MatchLowercase}
\defaultfontfeatures[\rmfamily]{Ligatures=TeX,Scale=1}
\fi
% use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
\IfFileExists{microtype.sty}{% use microtype if available
\usepackage[]{microtype}
\UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\makeatletter
\@ifundefined{KOMAClassName}{% if non-KOMA class
\IfFileExists{parskip.sty}{%
\usepackage{parskip}
}{% else
\setlength{\parindent}{0pt}
\setlength{\parskip}{6pt plus 2pt minus 1pt}}
}{% if KOMA class
\KOMAoptions{parskip=half}}
\makeatother
\usepackage{xcolor}
\IfFileExists{xurl.sty}{\usepackage{xurl}}{} % add URL line breaks if available
\IfFileExists{bookmark.sty}{\usepackage{bookmark}}{\usepackage{hyperref}}
\hypersetup{
pdfborder={0 0 0},
breaklinks=true}
\urlstyle{same} % don't use monospace font for urls
\setlength{\emergencystretch}{3em} % prevent overfull lines
\providecommand{\tightlist}{%
\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{-2}
% Redefines (sub)paragraphs to behave more like sections
\ifx\paragraph\undefined\else
\let\oldparagraph\paragraph
\renewcommand{\paragraph}[1]{\oldparagraph{#1}\mbox{}}
\fi
\ifx\subparagraph\undefined\else
\let\oldsubparagraph\subparagraph
\renewcommand{\subparagraph}[1]{\oldsubparagraph{#1}\mbox{}}
\fi
% set default figure placement to htbp
\makeatletter
\def\fps@figure{htbp}
\makeatother
\date{}
\begin{document}
\hypertarget{cs-127-spring-2020---homework-zero}{%
\section{CS 127 Spring 2020 - Homework
Zero}\label{cs-127-spring-2020---homework-zero}}
Due on \textbf{Thursday, January 30, 2020 at midnight}, but I recommend
you do this homework even before the first lecture.
\textbf{Some policies:} (See the
\href{http://cs127.boazbarak.org/syllabus/}{course syllabus} for the
full policies.)
\begin{itemize}
\item
You can collaborate with other students who are currently enrolled in
this course in brainstorming and thinking through approaches to
solutions, but you should write the solutions on your own and cannot
share them with other students.
\item
Sharing questions or solutions with anyone outside this course,
including posting on outside websites, is a violation of the honor
code policy. Collaborating with anyone except students currently
taking this course or using material from past years from this or
other courses is a violation of the honor code policy.
\item
The submitted PDF should be typed and in the same format and
pagination as ours. Please include the text of the problems and write
\textbf{Solution X:} before your solution. (This is easy to do if you
use our markdown template.) Please mark in Gradescope the pages where
the solution to each question appears. Points will be deducted if you
submit in a different format or do not mark the pages.
\end{itemize}
\textbf{By writing my name here I affirm that I am aware of all policies
and have abided by them while working on this problem set:}
\textbf{Your name:} (Write name and HUID here)
\textbf{Collaborators:} (List here names of anyone you discussed
problems or ideas for solutions with)
\textbf{Number of late days used so far:} (not including this pset; it
is your responsibility to make sure you do not go over the late days
budget)
\textbf{Number of late days used for this pset:}
\newpage
\hypertarget{probability-questions}{%
\subsection{Probability questions}\label{probability-questions}}
As we'll see in the first lecture, much of cryptography relies on
probability theory, and so basic knowledge of probability will be
essential. The
\href{https://www.introtcs.org/public/lec_15_probability.pdf}{CS 121
probability review} is one source for the probability theory we will
need. During the course I will assume you are familiar with (or can pick
up on your own) all notions presented there. However, if you find these
questions unfamiliar or difficult you should not despair! There are
plenty of sources on probability on the web, and in particular Harvard
STAT 110 and its textbook are of course wonderful resources. If any of
the notation is unfamiliar, looking at the lecture notes might help, and
otherwise feel free to ask questions on Ed, even before the semester
starts!
\textbf{Notation:} While often in probability theory people use the name
``random variable'' for a distribution over the set \(\mathbb{R}\) of
real numbers, it will be convenient for us to generalize this to
arbitrary sets, and hence we will use the following notation. We define
a \emph{random variable} or \emph{distribution} \(X\) over a finite set
\(S\) to correspond to the probabilistic experiment where we draw an
element \(x\) from \(S\) with some probability, which we denote by
\(\Pr[ X=x]\). All we need from these probabilities is that they are
non-negative and sum up to one. (One can also consider distributions
over infinite sets, though almost always in this course we will restrict
ourselves to the finite case.) We use \(x \leftarrow_R X\) as shorthand
for saying that \(x\) is drawn according to the distribution \(X\). If
\(f:S \rightarrow T\) is a function, then the random variable \(f(X)\)
corresponds to the probabilistic experiment where we draw
\(x \leftarrow_R X\) and output \(f(x)\).
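Since this notation is used throughout the course, here is a minimal sketch (in Python, purely for illustration; the representation as a dictionary is our own choice, not part of the problem set) of a distribution over a finite set as a map from elements to probabilities, and of the distribution \(f(X)\) obtained by pushing those probabilities through \(f\):

```python
from fractions import Fraction

def pushforward(X, f):
    """Distribution of f(X): draw x from X, output f(x)."""
    Y = {}
    for x, p in X.items():
        Y[f(x)] = Y.get(f(x), 0) + p
    return Y

# A distribution X over the finite set S = {0,1,2,3}, represented as a
# map x -> Pr[X = x]; the probabilities are non-negative and sum to one.
X = {x: Fraction(1, 4) for x in range(4)}
assert sum(X.values()) == 1

# f(X) for f(x) = x mod 2: the parity of a uniform element of {0,1,2,3}
# is a uniform bit.
Y = pushforward(X, lambda x: x % 2)
assert Y == {0: Fraction(1, 2), 1: Fraction(1, 2)}
```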
\textbf{Question 1 (30 points):} In this question we will study the
notion known as
\href{https://en.wikipedia.org/wiki/Total_variation_distance_of_probability_measures}{Total
Variation} or statistical distance. It is a basic notion of distance
between probability distributions, and its computational analog is
fundamental for cryptography.
\textbf{Question 1.1 (15 points):} If \(X\) and \(Y\) are two
distributions over the same set \(S\), we define the \emph{statistical
distance} of \(X\) and \(Y\), denoted as \(\Delta(X,Y)\) (also known as
\emph{total variation} distance of \(X\) and \(Y\)) to be
\(\sum_{x\in S} | \Pr[X=x]- \Pr[Y=x] |\). Prove that for every function
\(f:S \rightarrow [0,1]\),
\(| \mathbb{E}[f(X)] - \mathbb{E}[f(Y)] | \leq \Delta(X,Y)\).
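To make the definition concrete, here is a small numerical sketch (illustrative only, not part of any solution) that evaluates \(\Delta(X,Y)\) exactly as defined above, i.e.\ as \(\sum_{x\in S} | \Pr[X=x]- \Pr[Y=x] |\), on a toy pair of distributions, and checks the claimed inequality for one particular choice of \(f\):

```python
from fractions import Fraction

def delta(X, Y):
    """Statistical distance as defined in this problem set:
    sum over x of |Pr[X=x] - Pr[Y=x]| (note: no 1/2 factor)."""
    support = set(X) | set(Y)
    return sum(abs(X.get(x, 0) - Y.get(x, 0)) for x in support)

def expectation(X, f):
    """E[f(X)] for a distribution X given as a dict."""
    return sum(p * f(x) for x, p in X.items())

# Toy example: X uniform over {0,1}, Y biased toward 1.
X = {0: Fraction(1, 2), 1: Fraction(1, 2)}
Y = {0: Fraction(1, 4), 1: Fraction(3, 4)}

d = delta(X, Y)           # |1/2 - 1/4| + |1/2 - 3/4| = 1/2
f = lambda x: x           # one particular f: S -> [0,1]
gap = abs(expectation(X, f) - expectation(Y, f))  # |1/2 - 3/4| = 1/4
assert gap <= d
```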
\textbf{Solution 1.1:}
\textbf{Question 1.2 (15 points):} Prove that the statistical distance
satisfies the \emph{triangle inequality}: For every three distributions
\(X,Y,Z\) over the same set \(S\),
\(\Delta(X,Z) \leq \Delta(X,Y)+ \Delta(Y,Z)\).
\textbf{Solution 1.2:}
\textbf{Question 2 (40 points + 10 points bonus):} We will now use the
notion of statistical distance to study one of the most basic questions
in probability theory: if we are given a coin that is either completely
unbiased, or has bias \(\epsilon>0\) towards ``heads'', how many tosses
will it take for us to distinguish between the two cases?
\textbf{Question 2.1 (15 points):} Prove that we can distinguish between
an unbiased coin and one that has \(\epsilon\) bias towards ``heads''
using at most \(O(1/\epsilon^2)\) coin tosses. Specifically, prove that
if \(k>100/\epsilon^2\), then there exists some function
\(f:\{0,1\}^k \rightarrow \{0,1\}\) such that if \(X\) is the uniform
distribution over \(\{0,1\}^k\) and \(Y\) is the distribution obtained
by tossing \(k\) independent coins, each equaling \(1\) with probability
\(1/2+\epsilon\) and equaling \(0\) with probability \(1/2-\epsilon\),
then \(\Pr[ f(X)=0 ] > 0.9\) and \(\Pr[ f(Y) = 1 ] > 0.9\), and hence
\(f\) can distinguish between \(X\) and \(Y\).\footnote{\textbf{Hint:}
Use the Chernoff bound.}
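The following simulation illustrates the phenomenon numerically (a sketch, not a proof; the threshold test below is just one natural candidate for \(f\), and the parameters \(\epsilon=0.2\), \(k=2600>100/\epsilon^2\) are an arbitrary illustrative choice):

```python
import random

random.seed(127)
eps = 0.2
k = 2600                          # k > 100 / eps**2 = 2500
threshold = k * (0.5 + eps / 2)   # halfway between the two means

def f(x):
    """Candidate distinguisher: output 1 iff the number of heads
    lands closer to the biased mean than to the unbiased one."""
    return 1 if sum(x) > threshold else 0

def sample(p):
    """One draw of k independent coins, each 1 with probability p."""
    return [1 if random.random() < p else 0 for _ in range(k)]

trials = 200
frac_X_zero = sum(f(sample(0.5)) == 0 for _ in range(trials)) / trials
frac_Y_one = sum(f(sample(0.5 + eps)) == 1 for _ in range(trials)) / trials
assert frac_X_zero > 0.9 and frac_Y_one > 0.9
```

By the Chernoff bound, the sum of \(k\) coins concentrates within far less than \(\epsilon k /2\) of its mean for such \(k\), which is why both empirical frequencies come out near one.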
\textbf{Solution 2.1:}
\textbf{Question 2.2 (15 points):} We now study the converse problem:
showing a \emph{lower bound} on the number of coin tosses needed. (This
problem involves some notation, so take your time reading it carefully
and parsing what it means.) We will derive this bound using the
combination of Question 2.2 and Question 2.3.
For every \(k \in \mathbb{N}\), \(0 \leq i \leq k\), and \(\epsilon>0\),
let \(X^{k,\epsilon}_i\) be the following distribution over
\(\{0,1\}^k\): the first \(i\) bits of \(X^{k,\epsilon}_i\) are chosen
independently and uniformly at random, and the last \(k-i\) bits are
chosen independently at random but each is equal to \(1\) with
probability \(1/2+\epsilon\) and equal to \(0\) with probability
\(1/2-\epsilon\). Prove that for very \(k\), \(i 0.9\)
and \(\Pr[ f(Y) = 1 ] > 0.9\). Use your solution for Question 2.1, the
triangle inequality proved in Question 1.2, and the result of Question
1.1 (this is a special case of the
\href{https://en.wikipedia.org/wiki/Hybrid_argument_(Cryptography)}{Hybrid
Argument} which is used time and again in cryptography).
\textbf{Solution 2.3:}
\textbf{Question 2.4 (bonus problem: optional and more challenging - 10
points bonus):} Prove that in fact
\(\Delta(X,Y) \leq O(\sqrt{k}\epsilon)\). See footnote for
hint.\footnote{\textbf{Hint:} One way to prove this is to use the notion
of KL divergence which is another notion of distance between the
distributions that satisfies the triangle inequality. You can show
that the KL divergence of \(X^{k,\epsilon}_i\) and
\(X^{k,\epsilon}_{i+1}\) is \(O(\epsilon^2)\) and then use the Pinsker
Inequality that shows that the statistical distance between two
distributions is at most the square root of their KL divergence. You
can read about both KL divergence and the Pinsker Inequality on
Wikipedia as well as in several other sources.} Conclude that the method
of Question 2.1 is essentially \emph{optimal} in the sense that there
exists some absolute constant \(\delta\) (independent of \(\epsilon\))
such that for every \(\epsilon\) and distributions \(X,Y\) as in
Question 2.1, if \(k< \delta/\epsilon^2\) then there does not exist a
function \(f:\{0,1\}^k \rightarrow \{0,1\}\) such that
\(\Pr[ f(X)=0 ] > 0.9\) and \(\Pr[ f(Y) = 1 ] > 0.9\).
\textbf{Solution 2.4:}
\hypertarget{and-now-for-a-little-crypto}{%
\subsection{And now for a little
crypto}\label{and-now-for-a-little-crypto}}
\textbf{Question 3 (very important! 10 points):} Read the lecture notes
for
\href{https://www.intensecrypto.org/public/lec_01_introduction.pdf}{lecture
1: introduction}.
\textbf{Solution 3:} Write here ``I affirm that I read all of the
lecture notes''. Unlike CS 121, my lectures in CS 127/227 will be given
under the assumption that all students have carefully read the lecture
notes before each lecture. The notes are unpolished (to say the least),
so you may well run into typos/bugs and unclear points while reading
them. When you do, feel free to ask questions on Ed and/or to post
issues or pull requests on the GitHub repository.
\textbf{Question 4.1 (20 points):} Prove that an encryption scheme
\((E,D)\) with messages of length \(\ell\) and keys of length \(n\) is
\emph{perfectly secret} if and only if for every
\(x,x' \in \{0,1\}^\ell\), \(\Delta(Y^x,Y^{x'})=0\), where \(Y^x\) is
the distribution obtained by choosing a random
\(k \leftarrow_R \{0,1\}^n\) and outputting \(E_k(x)\).
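As a concrete sanity check (assuming, purely for illustration, the one-time pad \(E_k(x) = k \oplus x\) from the lecture notes; this is not part of the solution), one can compute the distributions \(Y^x\) exhaustively for small parameters and verify that \(\Delta(Y^x,Y^{x'})=0\) for every pair of messages:

```python
from fractions import Fraction
from itertools import product

n = ell = 3  # one-time pad: key length = message length

def Yx(x):
    """Distribution of E_k(x) = k XOR x over a uniformly random key k."""
    dist = {}
    for k in product([0, 1], repeat=n):
        c = tuple(ki ^ xi for ki, xi in zip(k, x))
        dist[c] = dist.get(c, 0) + Fraction(1, 2 ** n)
    return dist

def delta(X, Y):
    """Statistical distance sum_c |Pr[X=c] - Pr[Y=c]|."""
    support = set(X) | set(Y)
    return sum(abs(X.get(c, 0) - Y.get(c, 0)) for c in support)

# Every message induces the exact same (uniform) ciphertext distribution.
msgs = list(product([0, 1], repeat=ell))
assert all(delta(Yx(x), Yx(xp)) == 0 for x in msgs for xp in msgs)
```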
\textbf{Solution 4.1:}
\textbf{Question 4.2 (20 points - bonus but please do attempt it):} Let
\(\epsilon>0\). Define an encryption scheme \((E,D)\) with messages of
length \(\ell\) and keys of length \(n\) to be \(\epsilon\)-secret if
for every \(x,x' \in \{0,1\}^\ell\), \(\Delta(Y^x,Y^{x'}) < \epsilon\).
Suppose that \((E,D)\) is \(\epsilon\)-secret. Prove that for every
adversary Eve, which we model as a function
\(Eve:\{0,1\}^m \rightarrow \{0,1\}\) where \(m\) is the length of the
ciphertexts, and every pair of messages \(x_0,x_1 \in \{0,1\}^\ell\), the
probability that Eve wins in the game described below is less than
\(1/2+ 10\epsilon\).
The game is defined as follows:
\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\item
We pick \(k\) at random in \(\{0,1\}^n\).
\item
We pick \(b\) at random in \(\{0,1\}\).
\item
We let \(y = E_k(x_b)\).
\item
We let \(b' = Eve(y)\).
\item
Eve \emph{wins} iff \(b'=b\).
\end{enumerate}
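To help parse the game, here is a sketch that plays it out exhaustively for the one-time pad (a hypothetical choice of scheme, used only for illustration) and one arbitrary fixed Eve; since the one-time pad is \(0\)-secret, every Eve wins with probability exactly \(1/2\):

```python
from fractions import Fraction
from itertools import product

n = ell = 2  # one-time pad: key length = message length

def E(k, x):
    """One-time pad encryption: the ciphertext is k XOR x."""
    return tuple(ki ^ xi for ki, xi in zip(k, x))

def win_probability(eve, x0, x1):
    """Exact probability that Eve guesses b, over uniform k and b."""
    wins = total = 0
    for k in product([0, 1], repeat=n):   # step 1: pick k at random
        for b in (0, 1):                  # step 2: pick b at random
            y = E(k, (x0, x1)[b])         # step 3: y = E_k(x_b)
            wins += (eve(y) == b)         # steps 4-5: Eve wins iff b' = b
            total += 1
    return Fraction(wins, total)

eve = lambda y: y[0] ^ y[1]  # one arbitrary fixed adversary
assert win_probability(eve, (0, 0), (1, 1)) == Fraction(1, 2)
```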
\textbf{Solution 4.2:}
\end{document}