The blog was started under the title "mi facki lei cinri zasti" which translates to English as "I discover (all the) interesting things". It features my discoveries as well as my musings on them.
Sunday, 29 December 2013
Some intros to NoSQL databases
In general, I am a bit late for the whole NoSQL party, but nevertheless here are some interesting articles:
Saturday, 12 October 2013
A bunch of questions about founding a start-up company
- At what point should I legally found my company? Only once I am certain that my product and business model will work out? Or should I incorporate before I handle any data from businesses, or from private customers?
- If I were to collect data before legally founding my company, how can I legally transfer the prototype and the data to the company?
- In Germany, having an imprint on your website and following data protection laws is very important. Failure to correctly comply can easily lead to notices from lawyers paid by competitors - without a limited liability company, how can I protect myself?
- How do I handle expenses I incur before officially founding my company? What should I do about them if my plans do not materialize before legally founding? Are there any tax breaks that I can claim from pre-founding investments or expenditures?
- Can I "hire" any help in advertising/marketing before founding a company?
- How much collateral beyond the first necessary Euro do I need in reality? Does that have any practical impact on how my company can interact with other companies?
- What other costs can I avoid? I have to pre-pay taxes, but how should I go about making a good estimate of what to pay? Should I try to over- or underestimate? Can I avoid the IHK membership fee at the beginning?
- How should I handle expenses I privately financed before founding, e.g. hosting a landing page, some marketing costs, money spent to research the market and buy publications?
- How much money should I put into the company to keep it solvent, without tying up more unused liquidity in it than my personal bank account can handle? How about adding more money later on, especially when there are multiple founders? How to balance stake vs. money put in?
- In case I suspect that my company will be bought at some point, should I have a holding to hold the actual company, e.g. to avoid the early payment of taxes?
- If I plan to have investors, do I need to make any preparations so that parts of the company can easily be owned by third parties?
- Should I use the "Musterprotokoll" when founding my company or should I have a custom one made? Should I use a service provider like go-ahead or firma.de to found the company, or do everything on my own?
- Should I found a German company, or an English, American, etc. one? Especially in software and SaaS, being based somewhere else may be easier tax- and data protection-wise.
Tuesday, 27 August 2013
A quick overview of simple encryption for end-users
- gpgtools for mac, obtainable at https://gpgtools.org/
- gpg4win for windows, obtainable at http://www.gpg4win.org
- k-9 mail for android, obtainable at https://play.google.com/store/apps/details?id=com.fsck.k9
Surfing anonymously is a bit harder, and there is a choice of VPN providers. They are all centralized, i.e. there is one party that knows everything you do. There are a few peer-to-peer solutions, and the best and probably most famous one is the Tor project.
The Tor project offers a lot and is actually used in China, Egypt and other Arab countries to enable and protect members of the opposition. There is an amazing talk "How governments have tried to block Tor" by one of the creators, describing the arms race with China and other countries on en-/disabling access to the Tor network - once a user is in, it is almost impossible to track their activities.
Tuesday, 30 July 2013
The Multi-linear Jigsaw of "Candidate Indistinguishability Obfuscation" for functional software encryption (possibly part 1)
The concept of obfuscating and its security:
The issue the paper contributes to is as follows: can one make a program unintelligible while preserving its functionality, so that looking at it doesn't give you a whole lot of information? There is a result that roughly states that it is almost impossible to have an "encrypted simulation" of a program to obfuscate it, since it can be shown that one can construct an efficient unobfuscator that extracts the information used to make the program unintelligible.
A solution to this dilemma is given by a slightly different formulation of this problem, called indistinguishability obfuscation. Instead of resorting to simple models of computation such as Turing machines, it uses circuits. Given a particular class of circuits $\mathcal{C}$, an indistinguishability obfuscator $i\mathcal{O}$ should guarantee that for two equivalent circuits $C_1$ and $C_2$ the two distributions $i\mathcal{O}(C_1)$ and $i\mathcal{O}(C_2)$ are computationally indistinguishable. It has been an open question whether such an obfuscator exists for all polynomial-sized circuits.
The guarantee of the obfuscator $i\mathcal{O}$ is formalized with a security parameter $\lambda$; the circuit class of the obfuscator is actually a family $\{ \mathcal{C}_\lambda \}$, and the obfuscator takes the security parameter as an additional input, e.g. $i\mathcal{O}(\lambda, C)$. For any efficient distinguisher $D$ there is a negligible function $\alpha$ such that the following holds: for all $\lambda$ and all pairs of equivalent circuits $C_1, C_2 \in \mathcal{C}_\lambda$, the advantage of $D$ in telling $i\mathcal{O}(\lambda, C_1)$ apart from $i\mathcal{O}(\lambda, C_2)$ is bounded by $\alpha(\lambda)$.
The computation model:
The paper proposes an indistinguishability obfuscator for circuits of the class $NC^1$, that is, the class of all circuits that are polynomial in size and have logarithmic depth (specifically: $O(\log n)$); one can think of a polynomial number of parallel processors, each with a logarithmic runtime. It is not known whether this class equals $P$, but it can be shown that $NC^1 \subseteq NC \subseteq P$ by sequentializing the computations. For another idea about the class: $AC = NC$, where $AC^i$ contains all boolean circuits with depth $O(\log^i n)$ and a polynomial number of AND/OR gates that admit an arbitrary number of inputs.
The neat feature of $NC^1$ used for the later construction is given by Barrington's theorem: every circuit $C$ of $NC^1$ can be transformed into two collections of $k$ square matrices, $M^0_1, \ldots, M^0_k$ and $M^1_1, \ldots, M^1_k$, which model the computation of the circuit on an input $x$ of length $\ell$ by a matrix product: $C(x) = 0 \Longleftrightarrow \prod_{i=1}^k M_i^{x_{f(i)}} = I$ with a pre-defined $f: [k] \rightarrow [\ell]$.
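To make the matrix-product evaluation concrete, here is a minimal Python sketch. It is not the construction from the paper: the $2 \times 2$ permutation matrices, the choice of XOR as the circuit, and the names `evaluate`, `xor_program` and `xor_f` are assumptions made purely for illustration.

```python
# Toy matrix branching program in the style of the product formula above.
# Assumption: 2x2 permutation matrices and C = XOR are illustrative choices,
# not the matrices produced by Barrington's theorem.

IDENTITY = [[1, 0], [0, 1]]
SWAP = [[0, 1], [1, 0]]


def mat_mul(a, b):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def evaluate(program, f, x):
    """Return the product of M_i^{x_{f(i)}} over all steps i.

    program[i] is the pair (M_i^0, M_i^1); f[i] is the 0-based input
    position read at step i.
    """
    n = len(program[0][0])
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for (m0, m1), pos in zip(program, f):
        result = mat_mul(result, m1 if x[pos] else m0)
    return result


# Two steps, step i reads input bit i: the product is SWAP^(x_0 + x_1),
# which equals the identity exactly when x_0 XOR x_1 = 0.
xor_program = [(IDENTITY, SWAP), (IDENTITY, SWAP)]
xor_f = [0, 1]

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    is_identity = evaluate(xor_program, xor_f, x) == IDENTITY
    print(x, "product is I:", is_identity)
```

The selector `f` is fixed in advance and does not depend on the input, which is what lets the whole computation be written as a single product.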
The core construction:
Besides the need for additional security, the matrix computation will be encoded into $k$ groups $G_i$: if an entry of a matrix $M^0_i$ is $\alpha$, e.g. at $(1,1)$, it is encoded as $g_i^\alpha$. Using a multilinear map, the product can be represented in a single group $G_T$.
The ominous multilinear jigsaw puzzle is described as follows: given a multilinear map system over groups of prime order $p$, $e: G_1 \times \ldots \times G_k \rightarrow G_T$, a valid multilinear form is given by any expression involving operations in $G_T$, the multilinear map, and the operations within each group for the parameters of the multilinear map. The paper gives a sample expression for the case of $k = 3$ as $w_1 \cdot e(x_, y_3, z_1 z_2)^2 \cdot e(x_2^3 x_4, y_1^2, z_2 z_5^3) \cdot w^2_3$, where $x_i \in G_1$, $y_i \in G_2$ and so on, and $w_i \in G_T$.
Group elements $g$ can then be viewed as puzzle pieces for the puzzle given by such an expression. A solution is given when the expression yields a unit $g_T \in G_T$. The puzzle thus has a generator outputting system parameters $prms$ and some nonempty sets of elements $S_i = \{ x_1^{(i)}, \ldots \} \subset G_i$, and a validator that takes as input $(prms, S_1, \ldots, S_k, S_T, \prod)$, where the last tuple element is a valid multilinear form. It gives a positive answer if the form $\prod$ evaluated on the given elements yields the unit element of $G_T$.
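The validator can be illustrated with a deliberately insecure toy model in Python. Everything here is an assumption for illustration only: each group element is stored as its exponent modulo a small prime `P`, the case $k = 2$ is used instead of the paper's $k = 3$ sample, and the helper names `mul`, `power`, `e` and `is_solution` are made up.

```python
# Toy model only: each group element g_i^a is represented by its exponent a
# modulo a prime P. Discrete logs are therefore public and nothing here is
# secure; it only shows how a multilinear form is evaluated and checked.
P = 101  # hypothetical small prime group order


def mul(a, b):
    """Group operation: g^a * g^b = g^(a + b)."""
    return (a + b) % P


def power(a, n):
    """Exponentiation: (g^a)^n = g^(a * n)."""
    return (a * n) % P


def e(*args):
    """Multilinear map: e(g_1^a1, ..., g_k^ak) = g_T^(a1 * ... * ak)."""
    result = 1
    for a in args:
        result = (result * a) % P
    return result


def is_solution(target_exponent):
    """A form solves the puzzle if it evaluates to the unit g_T^0."""
    return target_exponent % P == 0


# Hypothetical puzzle pieces for k = 2: x_i in G_1, y_i in G_2, w in G_T.
x1, x2 = 3, 5
y1, y2 = 10, 6
w = 0

# Form: w * e(x1, y1) * e(x2, y2)^(-1); in exponents 3*10 - 5*6 = 0.
value = mul(w, mul(e(x1, y1), power(e(x2, y2), -1)))
print(is_solution(value))
```

In a real instantiation the exponents are hidden, so the only way to learn anything about the pieces is to plug them into valid forms and test for the unit.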
In order to prove security, the paper makes hardness assumptions of the form: for generators $g_1, \ldots, g_T$ of the groups $G_i$ and a form $e$ with $e(g_1, \ldots) = g_T$, it is computationally infeasible to distinguish tuples of exponential products of the generators.
The obfuscation scheme:
This construction is then used in a witness-encryption-like scheme:
We observe here an analogy to witness indistinguishable proofs: if a statement being proven only has a unique witness, then a witness-indistinguishable proof does not need to hide the witness. The way witness indistinguishability can be used is by explicitly constructing statements that can have multiple witnesses. Similarly, we will use indistinguishability obfuscation by constructing circuits that inherently have multiple equivalent forms. We use this analogy to build our main application of functional encryption.
This is used as follows.
Given an $NP$ language $L$, a witness encryption scheme for $L$ is an encryption scheme that takes as input an instance $x$ and a message bit $b$, and outputs a ciphertext $c$. If $x \in L$ and $w$ is a valid witness for $x$, then a decryptor can use $w$ to decrypt $c$ and recover $b$. However, if $x \not \in L$, then an encryption of 0 should be computationally indistinguishable from an encryption of 1.
Indistinguishability obfuscation for $NC^1$ implies witness encryption for an $NP$-complete language. The paper proceeds to argue for $L = SATISFIABILITY$, using the function $F_{x,b}(w)$ defined by:
- $F_{x,b}(w) = b$ if $w$ is a valid witness for $x$
- $F_{x,b}(w) = \perp$ otherwise
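As a plain, unobfuscated illustration of $F_{x,b}$ for satisfiability, consider the following Python sketch. The CNF encoding (lists of signed integers), the use of `None` for $\perp$, and the names `satisfies` and `make_F` are assumptions for this example; the actual scheme would publish an obfuscation of this function rather than the function itself.

```python
def satisfies(cnf, assignment):
    """Check whether `assignment` satisfies the CNF formula `cnf`.

    cnf is a list of clauses; each clause is a list of non-zero ints, where
    literal v means "variable v is true" and -v means "variable v is false".
    assignment maps variable numbers to booleans (missing means False).
    """
    return all(
        any(assignment.get(abs(lit), False) == (lit > 0) for lit in clause)
        for clause in cnf
    )


def make_F(x, b):
    """Build F_{x,b}: returns b on a valid witness, None (bottom) otherwise."""
    def F(w):
        return b if satisfies(x, w) else None
    return F


# Instance x: (v1 or v2) and (not v1 or v2)
x = [[1, 2], [-1, 2]]
F = make_F(x, b=1)
print(F({1: False, 2: True}))   # a satisfying assignment, so b is released
print(F({1: True, 2: False}))   # not a witness, so the result is None
```

If $x$ is unsatisfiable, $F_{x,0}$ and $F_{x,1}$ both return $\perp$ on every input and are therefore equivalent, which is exactly the situation in which indistinguishability obfuscation hides $b$.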
The remainder:
This finishes up this post, being possibly part 1, since the paper continues to:
- solve the open problem by lifting the $NC^1$ construction to obtain a construction for all polynomial-sized circuits
- provide functional encryption schemes for those circuits
- obtain a "meaningful" simulation-based security for functional encryption
- provide security proofs against several classes of attacks
The actual paper is at http://eprint.iacr.org/2013/451
Monday, 29 July 2013
Categorial constructions (Part 1)
Reading up on a paper, I discovered that I will need Day's convolution to construct tensors in a bicategory of monoidal supported profunctors. The hom-categories of these to $Set$ then capture supported pre-categories as a partial monoid.
So slowly working toward it, I flesh out some of my notes here:
Usually, one encounters convolution in electrical engineering and in image processing; the convolution of two functions $f$ and $g$, denoted by the convolution operator $*$ as $(f * g)$, is defined as
\[ (f*g)(t) = \int_{-\infty}^{\infty} f(\tau) g(t - \tau) d \tau \]
where $f, g: M \rightarrow \mathbb{C}$ are maps from a group to the complex numbers. It can be thought of as a moving window, given by one of the functions, that inspects the values of the other function. This definition can be extended to convolutions on functors $F$ from a monoidal category $M$ to the category of sets $Set$, $F: M \rightarrow Set$.
In the standard setting, convolution yields a commutative algebra without identity on the linear space of (suitably) measurable functions. For the extension, the functor category $Set^M$ of functors $F: M \rightarrow Set$ yields a monoidal category; Day's convolution is its tensor product.
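For the standard setting, a discrete analogue, with finite sequences in place of functions and a sum in place of the integral, shows the moving-window idea. This is only a sketch; the function name `convolve` and the sample values are made up for illustration.

```python
def convolve(f, g):
    """Discrete convolution of two finite sequences.

    (f * g)[n] = sum over k of f[k] * g[n - k], where terms with an index
    outside either sequence are treated as zero.
    """
    result = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            # The pair (i, j) contributes to position n = i + j.
            result[i + j] += fi * gj
    return result


signal = [1, 2, 3]
window = [0, 1, 0.5]
print(convolve(signal, window))
print(convolve(window, signal))  # same values: convolution is commutative
```

The condition that the two indices add up to $n$ is the part that carries over to the categorical version.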
A simple example to see how it works is given by graded sets in a categorical setting: given the discrete category $\mathbb{N}$ with natural numbers as objects and only identity mappings as morphisms, where addition plays the role of the monoidal product, the graded sets can be represented by functors $F: \mathbb{N} \rightarrow Set$. The convolution in the category is then
\[ (F*G)(n) = \sum_{i+j=n} F(i) \times G(j) = \sum_{i,j} F(i) \times G(j) \times hom_{\mathbb{N}}(n, i \otimes j) \]
where $\otimes$ is the addition of natural numbers. Moving on to a more general setting, one wants to replace $\mathbb{N}$ by a general monoidal category $C$. The convolution of two presheaves $F, G: C^{op} \rightarrow Set$ is defined as
$ (F*G)(e) = \int^{c,d \in Obj(C)} F(c) \times G(d) \times hom_C(e, c \otimes d)$
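For the graded-set example, the coend collapses to a finite disjoint union, which can be sketched in a few lines of Python. The dictionaries `F` and `G`, the tagging of pairs by their grades, and the function name `day_convolution` are assumptions made for this illustration.

```python
from itertools import product


def day_convolution(F, G, n):
    """(F * G)(n) for graded sets given as dicts {grade: set of elements}.

    The disjoint union over i + j = n is modelled by tagging each pair with
    its grades (i, j), so equal pairs from different summands stay distinct.
    """
    result = set()
    for i in range(n + 1):
        j = n - i
        for a, b in product(F.get(i, set()), G.get(j, set())):
            result.add(((i, j), (a, b)))
    return result


# Hypothetical graded sets: words graded by their length.
F = {0: {""}, 1: {"a", "b"}}
G = {1: {"x"}, 2: {"xy", "yx"}}

for n in range(4):
    print(n, len(day_convolution(F, G, n)))
```

In the general formula, the factor $hom_C(e, c \otimes d)$ takes over the role of the condition $i + j = n$.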
In that expression, the convolution product is given by a coend $\int$ of the functor under the integral symbol. The coend of a functor $F$ in $Set$ is an object $e$ with a universal dinatural transformation $\zeta: F \stackrel{..}{\rightarrow} e$. The notion of a dinatural transformation relaxes the requirements of a natural transformation: usually, for a natural transformation $\alpha: F \rightarrow G$ between functors $F$ and $G$, both functors depend on a variable $x$ either covariantly or contravariantly. A dinatural transformation, or in this case the slightly more restrictive notion of an extranatural transformation, applies when $F$ or $G$ depends on some variable both co- and contravariantly. It consists of a collection of morphisms $\alpha_c: F(c,c) \rightarrow G(c,c)$ such that for every morphism $f: c \rightarrow c'$ in $C$ the hexagon identity holds:
$G(c,f)\alpha_cF(f,c) = G(f,c')\alpha_{c'} F(c',f):F(c',c) \rightarrow G(c,c')$
In this setting, the coend object is an object of $Set$. Day's convolution has some nice properties regarding the preservation of colimits, as well as some interesting relations to the Yoneda embedding. The next posts will contain more on the properties of the convolution and possibly the Yoneda lemma, but will mainly focus on recovering supported precategories in a more abstract setting.
Saturday, 27 July 2013
Mahout
Monday, 29 April 2013
Writing Systems; Miss Korea and facial analysis
On another note, there is an interesting (mathematical) analysis of the top 20 Miss Korea contestants and the similarity of their faces.
Monday, 22 April 2013
Auctioneers and High Performance
Saturday, 6 April 2013
Quick permutations
    import itertools

    # Candidate endings: every 3-character combination of 'aApe'.
    partb = ["".join(x) for x in itertools.combinations('aApe', 3)]
    # Candidate beginnings: every 5-character permutation of 'gRrAet!1'.
    parta = ["".join(x) for x in itertools.permutations('gRrAet!1', 5)]
    # Print every beginning/ending pairing, one per line.
    print('\n'.join("".join(x) for x in itertools.product(parta, partb)))

But in the end, this post is just to test out the syntax highlighting feature in JavaScript.