SignalProcessingAndInformationTheory
This is a class for the Bachelor of Science in Bioinformatics. The course is in English, so this page is in that language. If you was looking for the page about AY 21-22, it's there.
Exams update
The process of correcting the assessment tests carried out in the classroom and at home has been concluded, here you can find the text of the final test, together with its unrolling
A proposal of vote has already been sent to those who have attended, on the basis of the two intermediate tests, which can be followed (on request) by an interview concerning the same topics as the tests
For those who have not attended, we could carry out a short part by writing, followed by an interview on the topics covered, even in the intermediate tests. But contact me for any doubt or question
Schedule, location and links
- Tuesday and Friday from 11:00 until 13:00 (or 14:00 on Tuesday)
- Classroom Psicologia II, Fisiologia Generale e Antropologia Farmacia e Medicina (CU026, E01PS1L084)
- Google classroom link, to be used as a communication and storage tool
- code esr6xyn - please subscribe with this code
- link https://classroom.google.com/c/NDc1NzU1NDMyMTk5?cjc=esr6xyn
- there you can find the full course description, the program, the bibliography, and the link to the ongoing translation of the reference book
- welcome to the A.Y. 22/23 class, the more recent update about the course is in this PDF.
Course summary
- Download the ZIP file containing all slide sets developed during the teaching - also in a four pages per page format
- Below is a table with links to the material screened in class during the year; the progresses made are reported further down
Topic | keywords | slide |
Course Overview | Things we will talk about | |
---|---|---|
Signals and Systems | Analog and numeric signals, Communications point of view, Signal properties, classes and operations: Frequently used signals, Complex numbers and exponential; Linear and time invariant system, non-linear systems | |
Fourier series and signals space | phasors, Fourier series, Parseval’s theorem, exponentials orthogonality, signal spaces, basis of representation, inner product, Schwartz inequality | |
Fourier transform and convolution | Cross energy, Parseval's theorem, Fourier transform and its properties, Dirac impulse, Impulse response, Convolution, Filtering, Windowing | |
Sampling and digital signal processing | Sampling, pulse train, aliasing, oversampling, decimation, interpolation, sample and hold, A/D and D/A conversion, uniform quantization, Discrete time Fourier transform, DFT, zeta transform, discrete and circular convolution, convolution via DFT, overlap and add | |
Signal Processing in Bioinformatics | Spectral analysis of the genome, Filtering of the genome, Other representation spaces, Numerical representation of codons, Long range DNA correlation, Fourier Transforms of Protein Sequences, Fourier transform infrared spectroscopy (FTIR) | |
Filters | Analog filters, polinomials, filter classes and design template, digital filters, Finite impulse response or FIR, First order infinite impulse response (IIR) filter, FIR synthesis starting from the continuous time description, Zeta transform and filtering, Synthesis of an IIR filter starting from an analog filter | |
Random signals and Wiener's theorem | Stationary and ergodic processes, Correlation and covariance for signals, autocorrelation and intercorrelation, geometry and adaptation, power density spectrum, Wiener’s theorem, multidimensional Gaussian and process, Spectral estimation, spectral density at the output of a filter, statistical characteristics at the output of a filter, sum and product of random and deterministic signals | |
Information Theory and biochemical applicartions | Information content of a discrete memoryless source, entropy, continuous sources, joint and conditional entropy, average mutual information, Discrete channel capacity, Distinguishing the type of biological signal, capacity, bottlenecks, measurements, bias, and allowable distortion | |
Intermediate tests | 2022: 1^{st}, 2^{nd}, 3^{rd}; 2023: 1^{st}, 2^{nd} |
Study material
Consists mainly of the slides projected in class, to add to
- my book in the original or translated version (please report errors!)
- the material linked in the classroom pages
- other material proposed from time to time in the classroom
- during the new cycle of lessons I'm updating the slides, so make sure you read the latest version
Progress of individual lessons
In addition to summarizing the topics covered from time to time, references to the support material used in class are also provided
- tue 7/3, fry 10/3 - Course Overview: After the illustration of the reference telematic tools for the course, we introduce ourselves. A brief introduction to the nature of signals and their processing follows, in order to provide an overview of the objects we will be dealing with. In accordance with what I have proposed to myself, some applicative aspects of signal processing and information theory in the biological, medical, genetic, microscopic and structural fields are then outlined, up to a mention of the most recent applications. But, really, don't be afraid of that, I just wanted to pique your curiosity to study signal processing, at least to understand anything you still don't understand about more advanced topics!
- link to the 1^{st} slide set Course Overview
- some waveform and spectral analysis tools, such as Audacity and Dfilter
- tue 14/3 - Signal and Systems: This series of slides covers the contents of chapter 1 in the Signal Basics book. After a brief introduction on the (co)domains of signals, on the family of Fourier analyzes and on the relationships between impulse response, convolution, frequency response and filtering, different classes of signals are defined, on the basis of their behavior over time and asymptotic, and some notes on common operations performed on signals, their combination and on graphs are provided. Then the more commonly used signals are illustrated.
- link to the 2nd slide set Signals and Systems
- some math animations (1) (2) (3) (4)
- sine and cosine as the result of uniform circular motion
- fri 17/3 - Signal and Systems 2: a complex algebra review is provided and an in-depth analysis is undertaken on the role of the complex exponential in frequency analysis. Since signal processing is typically implemented as the transit of the signal itself through a linear operator, the characteristics of the latter are explained, and therefore the differences with respect to non-linear systems are highlighted.
- Do you want to see the conformal map describing the result y=1/x? Follow the first link on this page and remember that 1/x = a/(a^{2}+b^{2}) - jb/(a^{2}+b^{2})
- Are you wondering what is so special about the number e that it is the only one that gives rise to Euler's formula? Here is a discussion on the subject
- tue 21/3 - Fourier series and signal spaces: After an introduction to phasors notation, the representation of periodic signals as an ordered set of Fourier coefficients is given, together with their reconstruction formula known as Fourier series. Then the alternative representations for real signals are given, as well as the effects of using only a limited set of coefficients. This section closes with the proof of the Parseval's theorem which gives a typical effect of the exponentials orthogonality property. Then we deal with the concepts of vector algebra when applied to signal spaces and, as a way to motivate the Fourier series validity, beginning to talk about the basis of representation of a vector space and the norm of a vector
- link to the 3^{rd} slide set Fourier series and signal spaces
- Go visualize the Fourier coefficients for some typical periodic waveforms, and experiment by changing the number of coefficients. Have you wondered why the Gibbs horns arise just around the discontinuity?
- fry 24/3 - Fourier series and signal spaces 2: After deepening the concepts of dimension of a vector space and of linear independence for its basis of representation, the concept of norm of a vector is introduced, followed by that of scalar product, and of the norm that it induces, by virtue of the existence of the Schwartz inequality which allows to define an angle between vectors. Therefore, having an orthonormal basis, the scalar product between vectors can be calculated on the basis of the respective coefficients, as happens in a Euclidean space, obtaining an identical definition of distance. These concepts are then applied to the spaces of periodic, energy and power signals, after having defined for them the formula that evaluates the inner product, and showing how many of the properties valid for signals are correlated in a simple way with the particularities of metric spaces. Furthermore, its shown how any linear transformation or operator can be thought of as the result of evaluating an inner product, so that many signal processing results such as Fourier transform, convolution, signal correlation, fall into this case.
- A new free culture book: Signal Processing for Communications - in chapter 3 you can find a good exposition about vector and signal spaces
- tue 28/2 - Fourier transform and convolution: The extension of Fourier series analysis to non-periodic signals is introduced, the Fourier transform defined, as well as its anti-transform, and the relationship in between the Fourier series and the transform of a single period is illustrated. The concepts of cross energy, orthogonality and Schwartz's inequality are stated, as well as Parseval’s theorem for energy signals; thus, a first list of properties for the Fourier transform is proved. The definition of the Dirac allows to obtain an expression for the transform of constant and periodic signals, to define the impulse response of a physical system and the convolution integral
- link to the 4^{th} slide set Fourier transform and convolution
- Test on the Falstad simulator how a periodic wave changes shape by altering only the phase of its Fourier coefficients
- fry 31/3 - Fourier transform and convolution - 2: A graphical interpretation of the convolution integral evaluation was illustrated, showing the memory function realized by the impulse response, with many interactive tools introduced. Equivalence between a product in time and a convolution in frequency (and vice versa) is asserted, thus opening an alternative path to perform signal processing operations, such as to measure the frequency response instead of h(t). The concepts of linear phase, delay and cascade systems have been revised. But instead of analyzing a filter, we can also synthesize it, as discussed thanks to Falstad's tool. Finally, the transform of a triangle is evaluated.
- Experiment yourself with the convolution animations whose links are on page 34 of the 4^{th} slide set
- Experiment yourself with the Falstad's filter synthesis tool
- tue 4/4 - Fourier transform and convolution - 3: The effects of the multiplication between signals either in time or in frequency are analyzed, arriving at filters and frequency response, as well as modulation, and windowing. Finally the pulse train signal is defined, and the related transform is evaluated. Sampling and digital signal processing - 1: The theory that allows the transition from the analog to the digital domain is illustrated, showing how a limited bandwidth signal can be fully described by knowing only its samples, provided that they are taken at a sufficiently high rate. Insights are given on the consequences of non-compliance with the requirements, as well as the measures necessary to guarantee them are described, without incurring excessive implementation complexity.
- In case you've never heard of it, check out the meaning of an integral function
- link to the 5^{th} slide set Sampling and digital signal processing
- Check the effect of aliasing
- fri 14/4 - Sampling and digital signal processing - 2: Other implementation aspects are addressed such as A/D conversion, sample quantization and associated error magnitude, and digital-to-analog restitution. Everything is now ready to discuss the different types of Fourier analysis for the digital domain, such as DTFT for indefinite sequences, or DFT and its inverse in the temporally limited (or periodic) case. In the last hour a review of the topics is carried out, in view of the intermediate test
- tue 18/4 - Intermediate test - Here you can find the test and execution
- fri 21/4 - Sampling and digital signal processing - 3 - After describing some programming resources, calculating the DTFT as a zeta-transform value on the unit circle allows us to express the positive-frequency bandwidth of the spectrum as π as well as fc/2, or N/2 for the points of DFT, or 1/2 as normalized frequency.
- fry 28/4 - Signal Processing in Bioinformatics - The frequency analysis of the genome is introduced, with the aim of detecting DNA coding regions within the ORFs, which determines a repetitive patterns whose main contribution is given by the length (3) of the codons. Hence, numerical representations are also provided for codon sequences. But another spectral feature of the genome is its long-range correlation, corresponding to a slope 1/f of its overall spectrum. Finally, the spectral analysis of the amino acid sequences of proteins is made possible using their EIIP. allowing to classify proteins according to their functional role
- link to the 6^{th} slide set Signal Processing in Bioinformatics
- tue 2/5 - Today we were only two people! End of Sampling and digital signal processing - 4 - Considerations about DFT as a filterbank led us to an introduction filtering by batch processing, and then we started the new slide set: Filters - The analysis of analog filters is first briefly introduced, since the terminology is still valid to describe different classes of numeric filters, and the result of their synthesis technique is converted into the zeta transform of the transfer function used to describe a numeric filterDigital filters are a class of filters operating according to a computational architecture rather than a circuit one, and can represent the effect of natural causes as well as being defined by man. The transversal filter or FIR is then introduced, both from the point of view of analysis and synthesis, highlighting some implementation aspects
- link to the 7^{th} slide set Filters
- fri 5/5 - Filters - 2 - The squared frequency response of a first order FIR is discussed in more detail, as it is the basis of how FT Infrared Spectrometry works. We then return to the first set of slides, to discuss in depth the operating principles of an FTIR. Back again to the filter chapter, the first order IIR is discussed together with its applications. At this point, numerical filters are illustrated, describing the synthesis of the coefficients for an FIR filter, as well as the description of a numerical filter using finite differences, arriving to describe its transfer function in terms of zeta transform, and to represent its canonical architecture. Finally, the design of numerical filters obtained by applying a change of variable to the transfer function resulting from the design of analog filters is mentioned
- tue 9/5 - Random signals and Wiener's theorem - If a signal is not known in advance we cannot perform any kind of Fourier spectral analysis, and we have to resort to a probabilistic description of it. After a brief review of the concepts of probability and statistics, the definition of a random process is introduced and how to compute its time ensemble averages is discussed. The conditions for classifying a process as stationary and ergodic are then explained. After some examples on the equivalence between time and ensemble averages, a 2D random variable is extracted from the process, and the correlation and covariance between its components is discussed, both from a probabilistic and statistical point of view. The autocorrelation function defined for both random processes and deterministic signals is then introduced, as well as the intercorrelation between different signals, and links to convolution are clarified
- link to the 8^{th} slide set Random signals and Wiener's theorem
- link to the Octave simulation on autocorrelation and more, link to the upper directory were you can download script.zip for the whole set of codes
- OPIS survey - I received the following communication: invite students to fill in the OPIS questionnaire during the lesson, providing them with the OPIS code and dedicating 15 minutes of the lesson to filling it out. OPIS code: C91FNG47 - vademecum per studenti OPIS 2022-2023, Dedicated instructions are available for courses in English
- fri 12/5 - Random signals and Wiener's theorem - 2 - The properties of autocorrelation are enunciated, and a few words are spent on the link between Pearson's coefficient and Schwartz's inequality, just as the matched filter can also be thought of in this way. Wiener's theorem is a valuable tool for obtaining the spectral density for any type of signal, be it deterministic or random, as well as for opening an alternative path towards already known results. Examples of its application are given...
- It is proposed to carry out the second intermediate assessment test on 30 May, the last day of class, so as not to overlap with the dates of early June, when exams for other courses are scheduled. On the other hand, we choose not to deal with the subject of signals on graphs, so as to be able to develop the remaining topics (filtering and combination of processes, information theory) with all the necessary attention
- tue 16/5 - Random signals and Wiener's theorem - 3 - Particular emphasis is given to the multivariate Gaussian and to its peculiar characteristics, as well as to its suitability to express the statistics of the samples of an ergodic Gaussian process. Spectral estimation by means of a periodogram is also explored. Finally, the issues of random signals and filtering are jointly addressed: in fact, the expression of the spectral density at the filter output is considered, and then its statistical description is evaluated. The set of slides ends by analyzing the result of adding or multiplying random processes
- fri 19/5 - Information theory - After having identified the two entities involved in the transmission of information as source and channel, the logarithmic link between information and probability is highlighted, and the measure of entropy is defined as an ensemble average. The entropy bounds for a binary or L-ary source, without and with memory, are then identified, and highlighted how the entropy value found represents the minimum possible coding speed. An example of a limited memory Markov source allows us to introduce the concept of conditional entropy. We then move on to continuous sources for which the differential entropy has particular aspects, which make it suitable for comparison between different sources but having the same variance. It is then shown how the Gaussian source is the most informative one, and for this reason it is the preferred model in cases of partial knowledge. A very brief introduction to the topic of source coding, both without and with loss of information, concludes the first part
- Link to the 9^{th} slide set - Information theory
- Link to on-line entropy calculators for text content: [ 1 ], [ 2 ]
- tue 23/5 - Information theory - 2 - Now is the time to operate on two different random variables, and define for them the joint entropy and two conditional entropies, as well as the more useful average mutual information I(X;Y), symmetric with respect to the variables, which can be evaluated by combining different types of entropy, and which leads to the definition of channel equivocation and noise entropy. The section concludes with the definition of channel capacity and related properties. The last part deals with the application of information theory concepts to biochemical signaling systems, showing how the variability of chemical reactions can make the biological signal ambiguous, and cause erratic cellular behavior. The symmetry of I(X;Y) has very useful consequences, but its value can be limited due to a small number of meanings for the signal.
- fri 26/5 - Information theory - 3 - The slide set concludes with some comments on how to measure the capacity of biological channels and how to apply the data processing inequality, bias, and rate-distortion theory to biochemical signaling systems
- tue 30/5 - Final intermediate test - Here you can find the text of the test, complete with its unrolling