Page 56 - Nanotechnology an introduction
One very important physical instantiation of one-dimensional texture is the sequences of nucleic acids that encode proteins and also constitute the
binding sites (promoters) for transcription factors—proteins that bind to DNA as a prerequisite for the transcription of the DNA into RNA that
precedes the translation of the RNA into amino acid sequences (proteins). The nucleic acids have a variety (i.e., alphabet size) of four: the bases A, U, C and G in natural RNA, and A, T, C and G in DNA. Many statistical investigations of intragenome DNA correlations group the four into purines (A and G) and pyrimidines (U or T, and C), rather like analyzing texts. Mere enumeration of the bases is of course inadequate to characterize texture.
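The purine/pyrimidine encoding mentioned above can be sketched in a few lines of Python (the function name and the choice of 1 for purines are illustrative assumptions, not a standard):

```python
def encode_pupy(seq: str) -> str:
    """Map a nucleic acid sequence onto a binary purine/pyrimidine string:
    purines (A, G) -> '1', pyrimidines (C, T, U) -> '0'."""
    mapping = {"A": "1", "G": "1", "C": "0", "T": "0", "U": "0"}
    return "".join(mapping[base] for base in seq.upper())

print(encode_pupy("GATTACA"))  # -> 1100101
```

The resulting binary string is the "standard form" on which the texture measures of the following sections can operate.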
Algorithmic Information Content
Irregularity can be quantified by estimating the algorithmic information content (AIC, also called algorithmic or Kolmogorov complexity), which is
essentially a formalization of the notion of estimating the complexity of an object from the length of a description of it. The first task is to encode the
measured variegation in some standard form. The choice of rules used to accomplish the encoding will depend on the particular problem at hand.
For the one-dimensional nucleic acid textures referred to in the preceding paragraph, encoding purines as 1 and pyrimidines as 0 may be
sufficient, for example. The formal definition of the AIC of a symbolic string s (encoding the object being described) is “the length of the smallest
(shortest) program P that will cause the standard universal computer (a Turing machine T) to print out the symbolic string and then halt”.
Symbolically (but only exact for infinite strings), denoting the AIC by K,

K(s) = min {|P| : C_T(P) = s},   (5.16)

where |P| is the length of the program P (in bits) and C_T(P) is the result of running the program P on the Turing machine T. Any regularity present within the string will enable the description to be shortened. The determination of AIC is therefore essentially one of pattern recognition, which works by
comparing the unknown object with known prototypes (there is no algorithm for discovering patterns de novo, although clusters may be established
according to the distances between features). The maximum value of the AIC (the unconditional complexity) is equal to the length of the string in the
absence of any internal correlations; that is, considering the string as random, viz.,
K_max = |s|,   (5.17)

where |s| is the length of the string. Any regularities, i.e., constraints in the choice of successive symbols, will diminish the value of K from K_max.
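Although the AIC itself is uncomputable, any lossless compressor yields an upper bound on it, since the decompressor plus the compressed data together form a program that prints the string. A minimal sketch using Python's zlib (the choice of compressor is an assumption for illustration, not part of the formal definition):

```python
import random
import zlib

def aic_upper_bound(s: str) -> int:
    """Length in bytes of a zlib-compressed copy of s: an upper bound
    (up to an additive constant) on the algorithmic information content."""
    return len(zlib.compress(s.encode("ascii"), 9))

random.seed(0)  # fixed seed so the "random" string is reproducible
regular = "10" * 500                                     # strong internal correlations
rnd = "".join(random.choice("01") for _ in range(1000))  # no exploitable regularities
print(aic_upper_bound(regular), aic_upper_bound(rnd))
```

The regular string compresses to a small fraction of its length, while the random string stays close to its entropy limit, illustrating how regularities pull K below K_max.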
Effective Complexity
The concept of effective complexity (EC) [61] was introduced in an effort to overcome the problem of AIC increasing monotonically with increasing
randomness. EC is defined as the length of a concise description (which can be computed in the same way as the AIC) of the set of regularities of
the description. A very regular symbolic sequence will have only a small number of different regularities, and therefore a short description of them; a
random sequence will have no regularities, and therefore an even shorter description. Some intermediate sequences, possessing many different regularities, will yield a large EC. Essentially,

EC = AIC − RIC,   (5.18)
where RIC is the random information content. In a certain sense, EC is actually a measure of our knowledge about the object being described, for it
quantifies the extent to which the object is regular (nonrandom), and hence predictable. It presents the same technical difficulty as AIC: that of
finding the regularities, both in compiling an initial list of them, and then in finding the regularities of the regularities.
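The monotonicity problem that EC is designed to avoid can be seen in a small compression experiment (a rough illustration only: compressed length is merely a stand-in for AIC, and zlib an arbitrary choice):

```python
import random
import zlib

def compressed_len(s: str) -> int:
    """Compressed length in bytes: a crude proxy for the AIC."""
    return len(zlib.compress(s.encode("ascii"), 9))

random.seed(1)  # deterministic "random" string for reproducibility
constant = "0" * 1000                                      # maximally regular
periodic = "0011" * 250                                    # simple repeating pattern
noisy = "".join(random.choice("01") for _ in range(1000))  # near-random

sizes = [compressed_len(s) for s in (constant, periodic, noisy)]
print(sizes)
# The AIC-like length grows with randomness, yet the near-random string is
# intuitively the least structured of the three.
```

EC corrects for this by charging only for the regularities, so the noisy string, despite its large AIC, receives a small effective complexity.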
5.4.3. Two-Dimensional Texture: Lacunarity
Whereas AIC and EC can be straightforwardly applied to linear texture, it is not generally obvious how the two-dimensional pattern should be
encoded. Of course it could be mapped in raster fashion (as is actually done in SPM and SEM), and patterns extending over many lines should
appear as regularities.
Another approach to capturing information about the spatial correlations of arbitrarily heterogeneous real surfaces is to extend the fractional
dimension or fractal representation of roughness to encompass the quantification of voids in rough objects (the lacunarity Λ). Consider an image
constructed from binary (black or white, corresponding to values of 0 and 1) pixels, and let the numbers of boxes of side r containing s white pixels
have the distribution n(s, r). Λ(r) is defined as

Λ(r) = M_2(r) / M_1^2(r),   (5.19)

where M_1 and M_2 are the first and second moments of the distribution,

M_1(r) = (1/N(r)) Σ_s s n(s, r) = 〈s〉   (5.20)

and

M_2(r) = (1/N(r)) Σ_s s^2 n(s, r) = σ_s^2 + 〈s〉^2,   (5.21)

where 〈s〉 and σ_s^2 are, respectively, the mean and variance of the distribution, and N(r) = (M − r + 1)^2 is the total number of boxes of size r for a square pattern of size M (i.e., Λ is a type of variance-to-mean ratio). The lacunarity can be thought of as the deviation of a fractal from translational invariance
[60], or a scale-dependent measure of heterogeneity (i.e., “texture”) of objects in general [4]. Its lowest possible value is 1, corresponding to a
translationally invariant pattern (including the special case Λ(M) = 1). Practically, comparison of the lacunarity plots (i.e., a log-log plot of the function
Λ(r)) of artificially generated patterns with the experimental lacunarity may be used to analyze the texture of the sample.
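The gliding-box computation behind Eqs. (5.19)-(5.21) can be sketched as follows (a minimal NumPy implementation, assuming the overlapping gliding-box convention with N(r) = (M − r + 1)^2 box positions):

```python
import numpy as np

def lacunarity(image: np.ndarray, r: int) -> float:
    """Gliding-box lacunarity Lambda(r) = M2 / M1**2 of a binary (0/1)
    square image, with moments taken over the white-pixel counts s of
    all (M - r + 1)**2 boxes of side r."""
    M = image.shape[0]
    # white-pixel count s for every box position (a sliding-window sum)
    counts = np.array([
        image[i:i + r, j:j + r].sum()
        for i in range(M - r + 1)
        for j in range(M - r + 1)
    ], dtype=float)
    m1 = counts.mean()         # first moment, <s>
    m2 = (counts ** 2).mean()  # second moment, <s**2>
    return m2 / m1 ** 2

# A translationally invariant (all-white) pattern attains the minimum value 1.
uniform = np.ones((8, 8), dtype=int)
print(lacunarity(uniform, 2))  # -> 1.0
```

Evaluating this function over a range of r and plotting log Λ against log r gives the lacunarity plot described above.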
5.5. Metrology of the Nano/Bio Interface
Here we come to the most difficult challenge for the nanometrologist. Whereas the techniques discussed in the preceding sections are closely