Musings on Math and Computing
In a previous post I investigated the effect of increasing the lengths of n-grams on information gain. There I demonstrated that beyond a certain length, making the n-grams longer yields no further information gain. However, some questions remained unresolved. Today, I will resolve them. [February 12]
In a previous post I implemented a word distance algorithm which is based on a graph created out of an English thesaurus. I then used that distance function to perform sentiment analysis. Today I will do the same for a Turkish thesaurus. [February 8]
A thesaurus is a dictionary which gives a list of somewhat equivalent words. A thesaurus path is a sequence of words
word_1, word_2, word_3, … , word_n
such that any two consecutive words appear in the list of entries of a third word, or of either word, in a given thesaurus. I define the thesaurus distance of a pair of words as the length of the shortest thesaurus path connecting one word to the other.
Today, I will implement the thesaurus distance in Lisp. [February 1]
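The definition above amounts to a shortest-path search in a graph whose vertices are words. Here is a minimal sketch in Python rather than Lisp, so it is not the implementation from the post. It assumes the thesaurus is a plain dictionary from a headword to its list of entries, and that the length of a path is counted in steps; the names build_graph and thesaurus_distance and the toy data are made up for illustration.

```python
from collections import deque

def build_graph(thesaurus):
    """Link two words when both occur in the same entry list, or when one
    is the headword of a list that contains the other."""
    graph = {}
    for head, entries in thesaurus.items():
        group = {head, *entries}
        for word in group:
            graph.setdefault(word, set()).update(group - {word})
    return graph

def thesaurus_distance(graph, source, target):
    """Breadth-first search; returns the number of steps on a shortest
    path, or None when no path exists."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        word, dist = queue.popleft()
        for neighbor in graph[word]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

# Toy thesaurus, invented for this example only.
toy = {
    "happy": ["glad", "joyful"],
    "glad": ["pleased"],
    "sad": ["unhappy"],
}
graph = build_graph(toy)
print(thesaurus_distance(graph, "happy", "pleased"))  # 2
print(thesaurus_distance(graph, "happy", "sad"))      # None
```

Breadth-first search is enough here because every step has the same cost, so the first time the target is reached the path is already a shortest one.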
I made a curious observation in my previous post that there is a phase transition in the way the entropy changes as one increases the lengths of the n-grams in a text. Today I want to see if this transition is specific to the dataset at hand, or if it is an artifact of the method I used. [January 28]
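For readers who want to experiment, here is a minimal sketch of measuring entropy as a function of n-gram length. It assumes character n-grams and Shannon entropy in bits of the empirical n-gram distribution; the posts may use a different definition, and the function name ngram_entropy is chosen only for this example.

```python
import math
from collections import Counter

def ngram_entropy(text, n):
    """Shannon entropy (in bits) of the empirical distribution of the
    overlapping character n-grams of `text`."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

sample = "the quick brown fox jumps over the lazy dog " * 20
for n in range(1, 9):
    print(n, round(ngram_entropy(sample, n), 3))
```

A text of length L has only L - n + 1 overlapping n-grams, so the value can never exceed log2(L - n + 1). That ceiling is worth keeping in mind when looking for a plateau in the output.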
The other day I was looking at the problem of determining how long a substring one needs in order to identify a large text. As we increase the length of the substring, the chance of an accidental match decreases. However, there will be a sweet spot: after a certain point, increasing the length of the substring will not give us any additional information. Today, I will try to figure out this breaking point. [December 13]
The Moebius function is an important function which became the centerpiece of a few recent conjectures: see here and here. Today, I will give an additively recursive definition of the Moebius function which does not require a factorization of the input into its prime factors, nor does it appeal to a modified version of the Sieve of Eratosthenes. I will also give an implementation in C. [September 16]
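One recursion of this kind, which may well differ from the one given in the post, goes through the Mertens function M(n) = mu(1) + ... + mu(n). It satisfies the classical identity M(n) + M(floor(n/2)) + ... + M(floor(n/n)) = 1, so M(n) can be computed from earlier values using only additions and integer divisions, and then mu(n) = M(n) - M(n-1). The sketch below is in Python rather than C, and the names mertens_table and moebius are chosen only for this example.

```python
def mertens_table(limit):
    """M[n] for 0 <= n <= limit, computed from
    M(n) = 1 - sum_{k=2}^{n} M(floor(n / k))."""
    M = [0] * (limit + 1)
    for n in range(1, limit + 1):
        M[n] = 1 - sum(M[n // k] for k in range(2, n + 1))
    return M

def moebius(n):
    """mu(n) as the difference of consecutive Mertens values, for n >= 1."""
    M = mertens_table(n)
    return M[n] - M[n - 1]

print([moebius(n) for n in range(1, 11)])
# [1, -1, -1, 0, -1, 1, -1, 0, 0, 1]
```

This version takes on the order of n^2 additions, so it is meant to illustrate the recursion and not to compete with a sieve.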
A description of my research
I do homological and homotopical algebra in the context of noncommutative geometry. You can find a detailed exposition of my past research and my present and future research interests in my research statement. Specifically, I am interested in Hopf equivariant cohomology theories, various flavors of Hochschild (co)homology, cyclic (co)homology and K-Theory. I am also interested in abstract homotopical algebra, operads, PROPs and their algebras. For the visually inclined, I have a map of my slanted view of the noncommutative geometry landscape.
Applied Statistics and History
In an ongoing project with Prof. Boğaç Ergene at the University of Vermont, we investigate social mobility patterns in the 18th-century Ottoman Empire. While he provided the historical context and the analysis of the results we obtained, I did the necessary data processing and performed the statistical analyses needed for the particular data set Prof. Ergene painstakingly generated from historical sources and archives. We have written three papers together so far, with more to come in the future.