acss {acss}  R Documentation 
Functions to obtain the algorithmic complexity for short strings, an approximation of the Kolmogorov Complexity of a short string using the coding theorem method.
acss(string, alphabet = 9) local_complexity(string, alphabet = 9, span = 5) likelihood_d(string, alphabet = 9) likelihood_ratio(string, alphabet = 9) prob_random(string, alphabet = 9, prior= 0.5)
string 

alphabet 

prior 

span 
size of substrings to be created from 
The algorithmic complexity is computed using the coding theorem method: For a given alphabet size (number of different symbols in a string), all possible or a large number of random samples of Turing machines (TM) with a given number of states (e.g., 5) and number of symbols corresponding to the alphabet size were simulated until they reached a halting state or failed to end. The outputs of the TMs at the halting states produces a distribution of strings known as the algorithmic probability of the strings. The algorithmic coding theorem (Levin, 1974) establishes the connection between the complexity of a string K(s) and its algorithmic probability D(s) as:
K(s) = log(D(s))
This package accesses a database containing data on 4.5 million strings from length 1 to 12 simulated on TMs with 2, 4, 5, 6, and 9 symbols.
For a more detailed discussion see Gauvrit, Singmann, SolerToscano, and Zenil (2014), http://complexitycalculator.com/methodology.html, or references below.
A matrix in which the rows correspond to the strings entered and the columns to the algorithmic complexity K and the algorithmic probability D of the string (see http://complexitycalculator.com/methodology.html).
A list with elements corresponding to the strings. Each list containes a named vector of algorithmic complexities (K) of all substrings in each string with length span.
A named vector with the likelihoods for string
given a detreministic process.
A named vector with the likelihood ratios (or Bayes factors) for string
given a random rather than detreministic process.
A named vector with the posterior probabilities that for a random process given the strings and the provided prior for being produced by a random process (default is 0.5, which correspond to a prior of 1  0.5 = 0.5 for a detereministic process).
The first time per session one of the functions described here is used, a relatively large dataset is loaded into memory which can take a considerable amount of time (> 10 seconds).
Delahaye, J.P., & Zenil, H. (2012). Numerical evaluation of algorithmic complexity for short strings: A glance into the innermost structure of randomness. Applied Mathematics and Computation, 219(1), 6377. doi:10.1016/j.amc.2011.10.006
Gauvrit, N., Singmann, H., SolerToscano, F., & Zenil, H. (2014). Algorithmic complexity for psychology: A userfriendly implementation of the coding theorem method. arXiv:1409.4080 [cs, stat]. http://arxiv.org/abs/1409.4080.
Gauvrit, N., Zenil, H., Delahaye, J.P., & SolerToscano, F. (2014). Algorithmic complexity for short binary strings applied to psychology: a primer. Behavior Research Methods, 46(3), 732744. doi:10.3758/s1342801304160
Levin, L. A. (1974). Laws of information conservation (nongrowth) and aspects of the foundation of probability theory. Problemy Peredachi Informatsii, 10(3), 3035.
SolerToscano, F., Zenil, H., Delahaye, J.P., & Gauvrit, N. (2012). Calculating Kolmogorov Complexity from the Output Frequency Distributions of Small Turing Machines. PLoS ONE, 9(5): e96223.
# WARNING: The first call to one of the functions # discussed on this page loads a large data set # and usually takes > 10 seconds. Stay patient. acss(c("HEHHEE", "GHHGGHGHH", "HSHSHHSHSS")) ## K.9 D.9 ## HEHHEE 23.38852 9.106564e08 ## GHHGGHGHH 33.50168 8.222205e11 ## HSHSHHSHSS 35.15241 2.618613e11 acss(c("HEHHEE", "GHHGGHGHH", "HSHSHHSHSS"))[,"K.9"] ## [1] 23.38852 33.50168 35.15241 acss(c("HEHHEE", "GHHGGHGHH", "HSHSHHSHSS"), alphabet = 2) ## K.2 D.2 ## HEHHEE 14.96921 3.117581e05 ## GHHGGHGHH 25.60208 1.963387e08 ## HSHSHHSHSS 26.90906 7.935321e09 acss(c("HEHHEE", "GHHGGHGHUE", "HSHSHHSHSS"), NULL) ## K.2 K.4 K.5 K.6 K.9 ## HEHHEE 14.96921 18.55227 19.70361 20.75762 23.38852 ## GHHGGHGHUE NA 31.75832 33.00795 34.27457 37.78935 ## HSHSHHSHSS 26.90906 29.37852 30.52566 31.76229 35.15241 ## D.2 D.4 D.5 D.6 ## HEHHEE 3.117581e05 2.601421e06 1.171176e06 5.640722e07 ## GHHGGHGHUE NA 2.752909e10 1.157755e10 4.812021e11 ## HSHSHHSHSS 7.935321e09 1.432793e09 6.469341e10 2.745360e10 ## D.9 ## HEHHEE 9.106564e08 ## GHHGGHGHUE 4.209915e12 ## HSHSHHSHSS 2.618613e11 ## Not run: likelihood_d(c("HTHTHTHT", "HTHHTHTT"), alphabet = 2) ## HTHTHTHT HTHHTHTT ## 0.010366951 0.003102718 likelihood_ratio(c("HTHTHTHT", "HTHHTHTT"), alphabet = 2) ## HTHTHTHT HTHHTHTT ## 0.3767983 1.2589769 prob_random(c("HTHTHTHT", "HTHHTHTT"), alphabet = 2) ## HTHTHTHT HTHHTHTT ## 0.2736772 0.5573217 ## End(Not run) local_complexity(c("01011010111" ,"GHHGGHGHUE"), alphabet = 5, span=5) ## $`01011010111` ## 01011 10110 01101 11010 10101 01011 10111 ## 16.22469 16.24766 16.24766 16.22469 16.24322 16.22469 15.93927 ## ## $GHHGGHGHUE ## GHHGG HHGGH HGGHG GGHGH GHGHU HGHUE ## 16.44639 16.44639 16.24766 16.22469 16.58986 16.86449 local_complexity(c("01011010111" ,"GHHGGHGHUE"), span=7) ## $`01011010111` ## 0101101 1011010 0110101 1101011 1010111 ## 26.52068 26.52068 26.47782 26.62371 26.29186 ## ## $GHHGGHGHUE ## GHHGGHG HHGGHGH HGGHGHU GGHGHUE ## 27.04623 26.86992 27.30871 27.84322