Tuesday, 30 July 2013

The Multi-linear Jigsaw of "Candidate Indistinguishability Obfuscation" for functional software encryption (possibly part 1)

This evening I noticed a few retweets about a code obfuscation technique that was reported as being new and groundbreaking. I was intrigued and read a good article, but was unsatisfied with the level of detail and the number of unanswered questions I was left with: "What is the concept of obfuscation they use; isn't there a proof of impossibility?", "Does it work for all programs?", "How secure, and how provable is this technique?" and "What does 'multilinear jigsaw puzzle' mean in detail?", as well as my curiosity regarding the 'how' of their construction. To answer these questions, I took a look at the paper itself and took some more detailed notes, which you can find below. In short, they have a precise notion of security for their understanding of obfuscation, as well as a progression of generalizing and securing their core idea.

The concept of obfuscating and its security:

The issue the paper contributes to is as follows: can one make a program unintelligible while preserving its functionality, so that looking at it doesn't give you a whole lot of information? There is a result that roughly states that this is impossible in general if the obfuscated program is supposed to act as an "encrypted simulation" of the original, since it can be shown that one can construct an efficient unobfuscator that extracts the information used to make the program unintelligible.
A way out of this dilemma is given by a slightly different formulation of the problem, called indistinguishability obfuscation. Instead of resorting to simple models of computation such as Turing machines, it uses circuits. Given a particular class of circuits $\mathcal{C}$, an indistinguishability obfuscator $i\mathcal{O}$ should guarantee that for two equivalent circuits $C_1$ and $C_2$ of the same size, the two distributions $i\mathcal{O}(C_1)$ and $i\mathcal{O}(C_2)$ are computationally indistinguishable. It has been an open question whether such an obfuscator exists for all polynomial-sized circuits.
The guarantee of the obfuscator $i\mathcal{O}$ is formalized with a security parameter $\lambda$; the circuit class of the obfuscator is actually a family $\{ \mathcal{C}_\lambda \}$, and the obfuscator is invoked as $i\mathcal{O}(\lambda, C)$. For any efficient distinguisher $D$ there is a negligible function $\alpha$ such that for all circuits $C_1, C_2 \in \mathcal{C}_\lambda$ that compute the same function, the advantage of $D$ in telling their obfuscations apart is bounded: $|\Pr[D(i\mathcal{O}(\lambda, C_1)) = 1] - \Pr[D(i\mathcal{O}(\lambda, C_2)) = 1]| \le \alpha(\lambda)$.

The computation model:

The paper proposes an indistinguishability obfuscator for circuits of the class $NC^1$, that is, the class of all circuits that are polynomial in size and have logarithmic (specifically: $O(\log^1 n)$) depth; one can think of a polynomial number of parallel processors, each with logarithmic runtime. It is not known whether $NC$ equals $P$, but it can be shown that $NC \subseteq P$ by sequentializing the computations. For another idea about the class: $AC = NC$, where $AC^i$ contains all boolean circuits with depth $O(\log^i n)$ and a polynomial number of AND/OR gates that admit an arbitrary number of inputs.
The neat feature of $NC^1$ used for the later construction is given by Barrington's theorem: every circuit $C$ in $NC^1$ can be transformed into $k$ pairs of square matrices $M^0_1, \ldots, M^0_k$ and $M^1_1, \ldots, M^1_k$ that model the computation of the circuit on an input $x$ of length $\ell$ by a matrix product: $C(x) = 0 \Longleftrightarrow \prod_{i=1}^k M_i^{x_{f(i)}} = I$, with a pre-defined $f: [k] \rightarrow [\ell]$ that selects which input bit is read at step $i$.
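To see this kind of evaluation in a small case, the following sketch encodes the AND of two bits as a product of four 5x5 permutation matrices. The particular permutations, the helper names (`perm_matrix`, `matmul`, `evaluate`) and the layout of `PROGRAM` are my own illustrative choices and are not taken from the paper; the only point is that the product collapses to the identity unless both inputs are 1.

```python
def perm_matrix(perm):
    """Permutation matrix with a 1 in row perm[j] of column j."""
    n = len(perm)
    return [[1 if perm[j] == i else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain square matrix product on lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(m):
    """For a permutation matrix the transpose is the inverse."""
    return [list(row) for row in zip(*m)]

IDENTITY = perm_matrix([0, 1, 2, 3, 4])
A = perm_matrix([1, 2, 3, 4, 0])  # a 5-cycle (illustrative choice)
B = perm_matrix([2, 0, 4, 1, 3])  # another 5-cycle that does not commute with A

# Each step is (f(i), M_i^0, M_i^1): the input position read and the two matrices.
# If both bits are 1 the product is the commutator A B A^-1 B^-1, which is not I.
PROGRAM = [
    (0, IDENTITY, A),
    (1, IDENTITY, B),
    (0, IDENTITY, transpose(A)),
    (1, IDENTITY, transpose(B)),
]

def evaluate(program, x):
    """Return 0 if the selected matrices multiply to the identity, else 1."""
    product = IDENTITY
    for position, m0, m1 in program:
        product = matmul(product, m1 if x[position] else m0)
    return 0 if product == IDENTITY else 1
```

Calling `evaluate(PROGRAM, (1, 1))` selects all four non-trivial matrices, while any input with a 0 bit makes a matrix and its inverse cancel.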

The core construction:

Leaving aside the additional measures needed for security, the matrix computation is encoded into $k$ groups $G_i$: if an entry of a matrix $M^0_i$, e.g. the one at position $(1,1)$, is $\alpha$, it is encoded as $g_i^\alpha$, where $g_i$ is a generator of $G_i$. Using a multilinear map, the product can then be represented in a single group $G_T$.

The ominous multilinear jigsaw puzzle is described as follows: given a multilinear map system $e: G_1 \times \ldots \times G_k \rightarrow G_T$ over groups of prime order $p$, a valid multilinear form is any expression built from the operations in $G_T$, the multilinear map $e$, and the group operations within each $G_i$ applied to the arguments of the map. The paper gives a sample expression for the case of $k = 3$ as $w_1 \cdot e(x_, y_3, z_1 z_2)^2 \cdot e(x_2^3 x_4, y_1^2, z_2 z_5^3) \cdot w^2_3$, where $x_i \in G_1$, $y_i \in G_2$ and so on, and $w_i \in G_T$.

Group elements $g$ can then be viewed as puzzle pieces for the puzzle given by such an expression. A solution is found when the expression yields the unit $g_T \in G_T$. The puzzle thus has a generator that outputs system parameters $prms$ and some nonempty sets of elements $S_i = \{ x_1^{(i)}, \ldots \} \subset G_i$, and a validator that takes as input $(prms, S_1, \ldots, S_k, S_T, \Pi)$, where the last tuple element is a valid multilinear form. It gives a positive answer if the form $\Pi$, evaluated on the given elements, yields the unit element of $G_T$.
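As a rough illustration of what "fitting pieces together" means, here is a deliberately insecure toy model for $k = 2$ in which every group element is represented openly by its exponent, so that the multilinear map becomes multiplication modulo a small prime. All of this (the prime `P`, the function names `e` and `is_unit`, the concrete pieces) is my own assumption for the sake of the example; real candidate multilinear maps hide the exponents and look nothing like this.

```python
P = 101  # toy prime group order (assumption; far too small for any real use)

def e(*exponents):
    """Toy multilinear map on exponents: e(g_1^a1, ..., g_k^ak) = g_T^(a1*...*ak)."""
    result = 1
    for a in exponents:
        result = (result * a) % P
    return result

def is_unit(target_exponent):
    """An element g_T^t of the target group is the unit exactly when t = 0 mod P."""
    return target_exponent % P == 0

# Puzzle pieces, given by their exponents.
x1, x2 = 3, 5    # elements of G_1
y1, y2 = 10, 6   # elements of G_2

# The form e(x1, y1) * e(x2, y2)^(-1): multiplying in G_T adds exponents,
# inverting negates the exponent.
fitting = (e(x1, y1) - e(x2, y2)) % P      # 3*10 - 5*6 = 0
not_fitting = (e(x1, y2) - e(x2, y1)) % P  # 3*6 - 5*10 = -32 mod 101
```

In this toy model the validator's job reduces to `is_unit(fitting)`, which holds, while `is_unit(not_fitting)` does not.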

In order to prove security, the paper makes hardness assumptions of the following form: given generators $g_1, \ldots, g_k, g_T$ of the groups $G_1, \ldots, G_k, G_T$ such that $e(g_1, \ldots, g_k) = g_T$, it is computationally infeasible to distinguish certain tuples of products of powers of the generators.

The obfuscation scheme:

This construction is then used in a witness-encryption-like scheme:
"We observe here an analogy to witness indistinguishable proofs: if a statement being proven only has a unique witness, then a witness-indistinguishable proof does not need to hide the witness. The way witness indistinguishability can be used is by explicitly constructing statements that can have multiple witnesses. Similarly, we will use indistinguishability obfuscation by constructing circuits that inherently have multiple equivalent forms. We use this analogy to build our main application of functional encryption."
The notion of witness encryption used here is:
Given an $NP$ Language $L$, a witness encryption scheme for $L$ is an encryption scheme that takes as input an instance $x$ and a message bit $b$, and outputs a ciphertext $c$. If $x \in L$ and $w$ is a valid witness for $x$, then a decryptor can use $w$ to decrypt $c$ and recover $b$. However, if $x \not \in L$, then an encryption of 0 should be computationally indistinguishable from an encryption of 1.
Indistinguishability obfuscation for $NC^1$ implies witness encryption for an $NP$-complete language. The paper proceeds to argue this for $L=SATISFIABILITY$, using the function $F_{x,b}(w)$ defined by:
  • $F_{x,b}(w) = b$ if $w$ is a valid witness for $x$
  • $F_{x,b}(w) = \perp$ otherwise
Obfuscating $F_{x,b}$ yields a witness encryption scheme, because for $SATISFIABILITY$ the function $F_{x,b}$ is in $NC^1$.
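The function $F_{x,b}$ itself is easy to write down for a formula in conjunctive normal form. The sketch below does only that and performs no obfuscation at all; the clause encoding (signed integers, as in DIMACS files) and the names `satisfies` and `make_F` are my own assumptions. In the scheme, the ciphertext would be an obfuscation of this function rather than the function in the clear.

```python
def satisfies(cnf, assignment):
    """Check a CNF formula against an assignment.

    cnf: list of clauses, each a list of nonzero ints, where k stands for
         variable k and -k for its negation.
    assignment: dict mapping each variable number to True or False.
    """
    return all(
        any(assignment[abs(literal)] == (literal > 0) for literal in clause)
        for clause in cnf
    )

def make_F(cnf, b):
    """Build F_{x,b} for the instance x = cnf and the message bit b."""
    def F(w):
        # None plays the role of the "bottom" symbol.
        return b if satisfies(cnf, w) else None
    return F

# (v1 or v2) and (not v1 or v2)
example_cnf = [[1, 2], [-1, 2]]
F = make_F(example_cnf, 1)
```

With the witness `{1: False, 2: True}` the call `F(...)` returns the message bit, while `{1: True, 2: False}` violates the second clause and yields `None`.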

The remainder:

This finishes up this post, being possibly part 1, since the paper continues on to:
  • solve the open problem by lifting the $NC^1$ construction to obtain a construction for all polynomial-sized circuits
  • provide functional encryption schemes for those circuits
  • obtain a "meaningful" simulation-based security notion for functional encryption
  • provide security proofs against several classes of attacks
But I figure that if someone wants to know these details too, they'd have already read the paper before arriving here -- while my questions along the lines of "What are they doing?", "Does it work for all programs?", "How secure, and how provable is this?" are answered sufficiently for now.
The actual paper is at http://eprint.iacr.org/2013/451

Monday, 29 July 2013

Categorical constructions (Part 1)

Reading up on a paper, I discovered that I will need Day's convolution to construct tensors in a bicategory of monoidal supported profunctors. The hom-categories of these to $Set$ then capture supported pre-categories as a partial monoid.

So slowly working toward it, I flesh out some of my notes here:

Usually, one encounters convolution in electrical engineering and in image processing; the convolution of two functions $f$ and $g$, written $f * g$ with the convolution operator $*$, is defined as

\[ (f*g)(t) = \int_{-\infty}^{\infty} f(\tau) g(t - \tau) d \tau \]

where $f, g: M \rightarrow \mathbb{C}$ are maps from a group $M$ (here the real line) to the complex numbers. It can be thought of as a moving window, given by one of the functions, that inspects the values of the other function. This definition can be extended to convolutions of functors $F: M \rightarrow Set$ from a monoidal category $M$ to the category of sets $Set$.
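The moving-window picture is easiest to see in the discrete analogue, where the integral becomes a finite sum $(f*g)[n] = \sum_k f[k] g[n-k]$. The following is a minimal sketch for finite sequences; the function name `convolve` is my own choice.

```python
def convolve(f, g):
    """Discrete convolution of two finite sequences:
    out[n] is the sum of f[i] * g[j] over all i + j = n."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

shifted = convolve([1, 2, 3], [0, 1])  # the window [0, 1] shifts f by one step
smoothed = convolve([1, 1], [1, 1])    # [1, 2, 1]
```

The index condition `i + j = n` in the inner loop is the same condition that reappears for graded sets in the categorical setting.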

In the standard setting, convolution yields a commutative algebra without identity on the linear space of (suitably) measurable functions. For the extension, the functor category $Set^M$ of functors $F: M \rightarrow Set$ yields a monoidal category; Day's convolution is its tensor product.

A simple example to see how it works is given by graded sets in a categorical setting: given the discrete category $\mathbb{N}$ with natural numbers as objects and only identity maps as morphisms, where addition plays the role of the monoidal product, graded sets can be represented by functors $F: \mathbb{N} \rightarrow Set$. The convolution in the category then is

\[ (F*G)(n) = \sum_{i+j=n} F(i) \times G(j) = \sum_{i,j} F(i) \times G(j) \times hom_{\mathbb{N}}(n, i \otimes j) \]
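For finite graded sets this formula can be computed directly. In the sketch below a graded set is a dictionary from grades to Python sets (a missing grade means the empty set), and the disjoint union is realized by tagging each pair with its grades; the representation and the name `day_convolution` are my own assumptions.

```python
from itertools import product

def day_convolution(F, G, n):
    """(F*G)(n): disjoint union over i + j = n of F(i) x G(j)."""
    result = set()
    for i in range(n + 1):
        j = n - i
        for a, b in product(F.get(i, set()), G.get(j, set())):
            # Tagging with (i, j) keeps the union disjoint.
            result.add((i, j, a, b))
    return result

F = {0: {"e"}, 1: {"a", "b"}}
G = {1: {"x"}, 2: {"y", "z"}}
degree_two = day_convolution(F, G, 2)
```

Here `degree_two` collects the pairs from $F(0) \times G(2)$ and $F(1) \times G(1)$, so its size is the sum of the products of the sizes, just as for the coefficients of a product of two polynomials.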

In the formula for graded sets, $\otimes$ is the addition of natural numbers. Moving on to a more general setting, one wants to replace $\mathbb{N}$ by a general monoidal category $C$. Given two presheaves $F, G: C^{op} \rightarrow Set$, their convolution is defined as

$ (F*G)(e) = \int^{c,d \in Obj(C)} F(c) \times G(d) \times hom_C(e, c \otimes d)$

In that expression, the convolution product is given by a coend $\int$ of the functor under the integral symbol. The coend of a functor $F$ with values in $Set$ is an object $e$ with a universal dinatural transformation $\zeta: F \stackrel{..}{\rightarrow} e$. In this case, the notion of a dinatural transformation relaxes the requirements of a natural transformation: usually, for a natural transformation $\alpha: F \rightarrow G$ between functors $F$ and $G$, both functors depend on some variable $x$ either covariantly or contravariantly. A dinatural transformation, or in this case the slightly more restrictive notion of an extranatural transformation, requires either $F$ or $G$ to depend on some variable both co- and contravariantly. It consists of a collection of morphisms $\alpha_c: F(c,c) \rightarrow G(c,c)$ such that for every morphism $f: c \rightarrow c'$ in $C$ the hexagon identity holds:

$G(c,f)\alpha_cF(f,c) = G(f,c')\alpha_{c'} F(c',f):F(c',c) \rightarrow G(c,c')$

In this setting, the coend object is an object of $Set$. Day's convolution has some nice properties regarding the preservation of colimits, as well as some interesting relations to the Yoneda embedding. The next posts will contain more on the properties of the convolution and possibly the Yoneda lemma, but will mainly focus on recovering supported precategories in a more abstract setting.