ACQUIRING AUDIO DATA Acquiring audio data means obtaining or collecting sound. Audio can be acquired and stored in many formats.
METHOD OF ACQUIRING. Collect data -> signal processing -> processed by CS -> can be used by users.
FORMATS OF AUDIO.
TYPES OF AUDIO. 1. LOSSY AUDIO: compression discards some data, so quality is reduced. 2. LOSSLESS AUDIO: an exact copy of the original audio is preserved.
TYPES WITH EXAMPLES.
LOSSY AUDIO : EXAMPLES : 1. MP3 - Developed by MPEG (Moving Picture Experts Group). - Compressed lossy audio format. - Maximum bit rate 320 kbps. - Smaller file size, faster file transfer, less storage space.
LOSSY AUDIO : EXAMPLES : 2. WMA (Windows Media Audio) - Designed by Microsoft. - Compressed lossy audio format. - Consumes less space.
LOSSLESS AUDIO : EXAMPLES : 1. WAV - Original, uncompressed audio, so files take up more space. - Designed by Microsoft and IBM. 2. FLAC (Free Lossless Audio Codec) - Maintained by Xiph.org. - Royalty-free license, open audio format.
OTHER FORMATS :
Data Storage Hardware
Bits and bytes A computer only understands the numbers 0 and 1, that is, whether a switch is off or on. We call those 1s and 0s 'bits' (binary digits). A byte (made up of 8 bits) is enough computer memory to store a single character of data (e.g. the letter F). The computer uses a code to understand what each bit pattern means. Using the ASCII code, for instance, the letter F is 70 and has a bit pattern of 01000110.
ASCII American Standard Code for Information Interchange (pronounced "askee") is a code which represents English characters as numbers. Each letter is assigned a number. For example, A = 65. Most computers use ASCII codes. This makes it possible to transfer data from one computer to another by changing the ASCII code into a binary pattern.
ASCII for Capital Letters
65 A    78 N
66 B    79 O
67 C    80 P
68 D    81 Q
69 E    82 R
70 F    83 S
71 G    84 T
72 H    85 U
73 I    86 V
74 J    87 W
75 K    88 X
76 L    89 Y
77 M    90 Z
Decimal and binary When we write numbers in the decimal system, we write them in columns. Each column is 10 times bigger than the one to its right. So 1010 is 1000 plus 10 = 1010, and 11010 would be 10000 plus 1000 plus 10 = 11010.
Column:   10000s  1000s  100s  10s  1s
1010:             1      0     1    0
11010:    1       1      0     1    0
In the binary system, everything is based on 2s, not 10s, so each column is twice as big as the one to its right. So 1010 in binary is 8 plus 2 = 10, and 11010 would be 16 plus 8 plus 2 = 26.
Column:   128s  64s  32s  16s  8s  4s  2s  1s
1010:                          1   0   1   0
11010:                    1    1   0   1   0
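The column method can be sketched in a few lines of Python. This is only an illustration: the function name binary_to_decimal is made up for this example, and it assumes the input is a string of 0 and 1 characters.

```python
def binary_to_decimal(bits):
    """Add up the column values (1s, 2s, 4s, 8s, ...) wherever the bit is 1."""
    total = 0
    # Walk from the rightmost column (the 1s) to the left.
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total


print(binary_to_decimal("1010"))   # 8 + 2
print(binary_to_decimal("11010"))  # 16 + 8 + 2
```

Python's built-in int("1010", 2) performs the same conversion; the explicit loop is there only to mirror the column picture.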
Binary patterns Using the binary system, convert the ASCII code into the binary pattern. The first row has been completed for you.
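One way to check answers to this kind of exercise is a short Python sketch. The helper name ascii_bit_pattern is hypothetical, and the sketch assumes 8-bit patterns, as in the letter F example.

```python
def ascii_bit_pattern(letter):
    """Return the 8-bit binary pattern for a single ASCII character."""
    code = ord(letter)          # ASCII code, e.g. 70 for "F"
    return format(code, "08b")  # zero-padded to 8 binary digits


for letter in "AFZ":
    print(letter, ord(letter), ascii_bit_pattern(letter))
```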
All computer data is stored in binary form. This includes not only text but also images, sounds and movies. The more complex the data, the more memory is used to store it.
The amount of data stored is measured in kilobytes (KB). 1 megabyte (MB) is 1,000 KB (2^20 bytes). 1 gigabyte (GB) is 1,000 MB (2^30 bytes). 1 terabyte (TB) is 1,000 GB (2^40 bytes). Confusingly, 1 KB is actually 1,024 bytes (2^10), not 1,000 as you might expect, so each step up is really a factor of 1,024, but most people think in multiples of 1,000.
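The powers of two behind these units can be made concrete with a small Python sketch. The function name format_size and the two-decimal output format are choices made for this example only, and the sketch uses the 1,024-based definitions.

```python
KB = 2 ** 10  # 1,024 bytes
MB = 2 ** 20  # 1,024 KB
GB = 2 ** 30  # 1,024 MB
TB = 2 ** 40  # 1,024 GB

UNITS = ["bytes", "KB", "MB", "GB", "TB"]


def format_size(n_bytes):
    """Express a byte count in the largest unit that keeps the number below 1,024."""
    size = float(n_bytes)
    for unit in UNITS:
        if size < 1024 or unit == UNITS[-1]:
            return f"{size:.2f} {unit}"
        size /= 1024


print(format_size(1536))
print(format_size(5 * GB))
```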
Read Only Memory (ROM) When a computer is first switched on, it needs to load up the BIOS (Basic Input/Output System) and basic instructions for the hardware. These are stored in ROM (Read Only Memory). This type of memory is called non-volatile because it retains its data without power: data stored in ROM remains there even when the computer is switched off. ROM can be found on the motherboard.
Random Access Memory (RAM) While a computer is running, operating instructions, any computer programs that are opened and their data are stored temporarily in the RAM (Random Access Memory). When the computer is switched off, all the data is cleared from the RAM. This type of memory is called volatile because it only stores the data while the computer is switched on. RAM sticks are found on the motherboard. The contents of RAM are constantly rewritten as the data is processed.
Storage devices and media The medium is what the data is actually stored on. Examples of media include floppy disks, CD-ROMs and zip disks. There are three types of storage device: those that store data by magnetizing a special material that coats the surface of a disk; those that store data using optical technology to etch the data onto a plastic-coated metal disk, where laser beams are then passed over the surface to read the data; and solid state devices, such as memory sticks. We will look at these in more detail later on.
Fixed storage
Removable Storage
Summary Data is stored using binary code (0 and 1). Computer memory is measured in kilobytes. Read Only Memory (ROM) is non-volatile because it keeps its data when the computer is switched off. Random Access Memory (RAM) is volatile because it only holds data while the computer is switched on. There are three types of storage devices: those that use magnetic media, those that use optical media, and solid state devices. Different types of media have different storage capacities. Storage devices can also be divided into those that are fixed and those that are removable.
What is an Image? An image is a projection of a 3D scene onto a 2D projection plane. An image can be defined as a two-variable function $f(x,y): \mathbb{R}^2 \to \mathbb{R}$, where for each position (x,y) in the projection plane, f(x,y) defines the light intensity at this point.
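Image as a function
Image Acquisition The continuous image is modeled as f(x,y) = i(x,y)⋅r(x,y), where i(x,y) is the illumination and r(x,y) the reflectance. Acquisition turns it into the digital image g(i,j). Pixel = picture element.
Acquisition System World -> Camera (CMOS sensor) -> Digitizer -> Digital Image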
Image Types Three types of images:
Binary images: g(x,y) ∈ {0, 1}
Gray-scale images: g(x,y) ∈ C, typically C = {0,…,255}
Color images: three channels, g_R(x,y) ∈ C, g_G(x,y) ∈ C, g_B(x,y) ∈ C
Notations
Image Intensity: light energy emitted from a unit area in the image. Device dependent.
Image Brightness: the subjective appearance of a unit area in the image. Context dependent and subjective.
Image Gray-Level: the relative intensity at each unit area, between the lowest intensity (black value) and the highest intensity (white value). Device independent.
Intensity vs. Brightness
[Figure: intensity staircases with equal intensity steps and with equal brightness steps; f_1 < f_2, Δf_1 = Δf_2.]
Weber's Law Describes the relationship between the physical magnitude of a stimulus and its perceived intensity. In general, the Δf needed for a just noticeable difference (JND) over a background f was found to satisfy:
$$\frac{\Delta f}{f} = \text{const}$$
Consequently, Brightness ∝ log(f).
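If brightness grows with log(f), then steps that look equally spaced should be equal ratios of intensity rather than equal differences. The Python sketch below illustrates this. The function name equal_brightness_steps and the intensity range 1 to 100 are chosen purely for this example.

```python
def equal_brightness_steps(f_min, f_max, n):
    """Return n intensities from f_min to f_max with a constant ratio between neighbours."""
    ratio = (f_max / f_min) ** (1 / (n - 1))
    return [f_min * ratio ** k for k in range(n)]


levels = equal_brightness_steps(1.0, 100.0, 5)

# Relative step (delta f) / f between neighbouring levels: constant by construction.
weber_fractions = [(b - a) / a for a, b in zip(levels, levels[1:])]

print([round(v, 2) for v in levels])
print([round(v, 3) for v in weber_fractions])
```

The absolute differences between neighbouring levels grow as intensity grows, while the relative differences stay the same.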
What about Color Space? JNDs in the XYZ color space were measured by Wright and Pitt, and by MacAdam, in the 1930s and 1940s. MacAdam ellipses: JNDs plotted on the CIE xy diagram. Conclusion: measuring perceptual distances in the CIE-XYZ space is not a good idea.
Perceptually Uniform Color Space Most common: CIE-L*a*b* (CIELAB) color space. L* represents lightness, a* represents the difference between green and red, and b* represents the difference between blue and yellow.
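Perceptually Uniform Color Space XYZ to CIELAB conversion:
$$L^* = 116\, f(Y/Y_n) - 16, \qquad a^* = 500\,[\,f(X/X_n) - f(Y/Y_n)\,], \qquad b^* = 200\,[\,f(Y/Y_n) - f(Z/Z_n)\,]$$
$$f(t) = \begin{cases} t^{1/3} & t > (6/29)^3 \\ \dfrac{t}{3\,(6/29)^2} + \dfrac{4}{29} & \text{otherwise} \end{cases}$$
where $(X_n, Y_n, Z_n)$ are the XYZ values of a reference white point.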
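The conversion can be sketched in Python using the standard CIE definition (cube root with a linear segment near zero). The D65 white point values and the function names are assumptions made for this example. The slides do not say which reference white is intended.

```python
DELTA = 6 / 29

# Assumed reference white: D65, scaled so that Y_n = 100.
D65_WHITE = (95.047, 100.0, 108.883)


def _f(t):
    """Cube root above (6/29)^3, linear segment below it."""
    if t > DELTA ** 3:
        return t ** (1 / 3)
    return t / (3 * DELTA ** 2) + 4 / 29


def xyz_to_lab(x, y, z, white=D65_WHITE):
    """Convert one XYZ triple to (L*, a*, b*) relative to the given white point."""
    xn, yn, zn = white
    fx, fy, fz = _f(x / xn), _f(y / yn), _f(z / zn)
    lightness = 116 * fy - 16
    a = 500 * (fx - fy)
    b = 200 * (fy - fz)
    return lightness, a, b


print(xyz_to_lab(*D65_WHITE))       # the white point itself
print(xyz_to_lab(41.24, 21.26, 1.93))
```

Distances between the resulting (L*, a*, b*) triples are a closer match to perceived color differences than distances in XYZ.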
Digitization Two stages in the digitization process:
Spatial sampling: spatial domain
Quantization: gray level
Continuous image f(x,y) -> digital image g(i,j) ∈ C. Example 5x5 digital image g(i,j):
100 100 100 100 100
100   0   0   0 100
100   0   0   0 100
100   0   0   0 100
100 100 100 100 100
Spatial Sampling When a continuous scene is imaged on the sensor, the continuous image is divided into discrete elements - picture elements (pixels)
Sampling The density of the sampling determines the separation capability of the resulting image. Image resolution defines the finest details that are still visible in the image. We use a cyclic pattern to test the separation capability of an image.
Sampling Rate
1D Example: Nyquist Frequency Nyquist Rule: To observe details at frequency f (wavelength d) one must sample at a frequency > 2f (sampling interval < d/2). The frequency 2f is the Nyquist frequency. Aliasing: if the pattern wavelength is less than twice the sampling interval, erroneous patterns may be produced.
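A small Python sketch shows aliasing in one dimension. A 9 Hz sine sampled at only 10 samples per second (well below the 18 needed) produces exactly the same samples, up to sign, as a 1 Hz sine. The frequencies and the helper names sample_sine and alias_frequency are chosen only for this illustration.

```python
import math


def sample_sine(freq, sample_rate, n):
    """Take n samples of sin(2*pi*freq*t) at the given sampling rate."""
    return [math.sin(2 * math.pi * freq * k / sample_rate) for k in range(n)]


def alias_frequency(freq, sample_rate):
    """Apparent frequency after sampling: distance to the nearest multiple of the rate."""
    return abs(freq - sample_rate * round(freq / sample_rate))


fast = sample_sine(9.0, 10.0, 20)  # undersampled: 10 < 2 * 9
slow = sample_sine(1.0, 10.0, 20)  # properly sampled: 10 > 2 * 1

# The undersampled 9 Hz signal is indistinguishable from an inverted 1 Hz signal.
print(all(math.isclose(a, -b, abs_tol=1e-9) for a, b in zip(fast, slow)))
print(alias_frequency(9.0, 10.0))
```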
Aliasing - Moiré Patterns
Temporal Aliasing
Temporal Aliasing Example
Image De-mosaicing Can we do better than Nyquist?
Image De-mosaicing Basic idea: use correlations between color bands
Quantization Choose the number of gray levels (according to the number of assigned bits). Divide the continuous range of intensity values into that many intervals.
Quantization [Figure: 8-bit image compared with 4-bit image.] Low-frequency areas are more sensitive to quantization.
How should we quantize an image? Simplest approach: uniform quantization. [Figure: gray level plotted against sensor voltage. The sensor-voltage axis is split at decision boundaries Z_0, Z_1, ..., Z_k, and each interval maps to a quantization level q_0, q_1, ..., q_{k-1}.]
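Uniform quantization can be sketched in Python as follows. The function name uniform_quantize is hypothetical, and placing each representative value at the middle of its interval is an assumption made for this example.

```python
def uniform_quantize(z, z_min, z_max, levels):
    """Map a voltage z in [z_min, z_max] to (level index, representative value).

    The range is cut into `levels` equal intervals; each interval is
    represented by its midpoint.
    """
    step = (z_max - z_min) / levels
    index = int((z - z_min) / step)
    index = max(0, min(index, levels - 1))  # clamp, so z == z_max lands in the top level
    return index, z_min + (index + 0.5) * step


for voltage in (0.0, 0.3, 0.5, 1.0):
    print(voltage, uniform_quantize(voltage, 0.0, 1.0, 4))
```

With b bits per pixel, levels would be 2 ** b, for example 256 levels for an 8-bit image.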
Non-uniform Quantization Quantize according to visual sensitivity (Weber's Law). Non-uniform sensor voltage distribution. [Figure: non-uniformly spaced decision boundaries Z_0, ..., Z_7 and quantization levels q_0, ..., q_6, with regions of low and high visual sensitivity marked.]
Optimal Quantization (Lloyd-Max) Content dependent. Minimizes the quantization error. [Figure: quantization level plotted against sensor voltage, with decision boundaries Z_0, ..., Z_4 and quantization levels q_0, ..., q_3.]
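Optimal Quantization (Lloyd-Max) Also known as the Lloyd-Max quantizer. Denote by P(z) the probability of sensor voltage z. The quantization error is:
$$E = \sum_{i=0}^{k-1} \int_{Z_i}^{Z_{i+1}} (z - q_i)^2\, P(z)\, dz$$
Solution:
$$q_i = \frac{\int_{Z_i}^{Z_{i+1}} z\, P(z)\, dz}{\int_{Z_i}^{Z_{i+1}} P(z)\, dz}, \qquad Z_i = \frac{q_{i-1} + q_i}{2}$$
Iterate until convergence (but reaching the global minimum is not guaranteed).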
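The iteration can be sketched in Python on a finite list of voltage samples, which stand in for the probability P(z). That substitution, the uniform starting levels and the name lloyd_max are all assumptions of this example. Each pass places the decision boundaries midway between neighbouring levels and then moves every level to the mean of the samples in its interval.

```python
def lloyd_max(samples, levels, max_iterations=50):
    """Return (quantization levels, decision boundaries) fitted to the samples."""
    lo, hi = min(samples), max(samples)
    # Start from uniformly spaced levels over the sample range.
    q = [lo + (i + 0.5) * (hi - lo) / levels for i in range(levels)]
    for _ in range(max_iterations):
        # Boundaries sit halfway between neighbouring levels.
        bounds = [(a + b) / 2 for a, b in zip(q, q[1:])]
        cells = [[] for _ in range(levels)]
        for z in samples:
            index = sum(z > b for b in bounds)  # number of boundaries below z
            cells[index].append(z)
        # Each level moves to the mean (centroid) of its cell; empty cells stay put.
        new_q = [sum(c) / len(c) if c else q[i] for i, c in enumerate(cells)]
        if new_q == q:
            break
        q = new_q
    return q, [(a + b) / 2 for a, b in zip(q, q[1:])]


voltages = [0, 0, 1, 1, 9, 10, 10, 11]
print(lloyd_max(voltages, 2))
```

As with the continuous formulation, the result depends on the starting levels and may be a local rather than a global minimum.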
Color Quantization Common color resolution for high quality images is 256 levels for each of the Red, Green and Blue channels, or 256^3 = 16,777,216 colors. How can an image be displayed with fewer colors than it contains? Select a subset of colors (the colormap or palette) and map the rest of the colors to them. (from: Daniel Cohen-Or)
Color Quantization With 8 bits per pixel and a color look-up table we can display at most 256 distinct colors at a time. To do that we need to choose an appropriate set of representative colors and map the image into these colors. [Figure: example image stored as look-up table indices.]
Color Quantization [Figure: the same image shown with 256, 16, 4 and 2 colors.]
Color Quantization Naïve (uniform) color quantization, 24 bit to 8 bit: retain the 3-3-2 most significant bits of the R, G and B components. This produces false contours.
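The 3-3-2 scheme can be sketched with bit operations in Python. The layout RRRGGGBB within the byte and the function names pack_332 and unpack_332 are choices made for this example. The slide only states how many bits each channel keeps.

```python
def pack_332(r, g, b):
    """Keep the top 3 bits of R, top 3 bits of G and top 2 bits of B in one byte."""
    return (r & 0xE0) | ((g & 0xE0) >> 3) | (b >> 6)


def unpack_332(code):
    """Expand an 8-bit code back to (r, g, b); the discarded low bits come back as 0."""
    r = code & 0xE0
    g = (code << 3) & 0xE0
    b = (code << 6) & 0xC0
    return r, g, b


code = pack_332(200, 100, 50)
print(code, unpack_332(code))
```

Every color whose channels share the same leading bits collapses to a single code, which is where the false contours in smooth gradients come from.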
Median Cut [Figure: image colors plotted in RGB space, with axes R, G and B.]
The median cut algorithm
Color_MedCut (Image, n) {
    For each pixel in Image with color C, map C in RGB space;
    B = {RGB space};
    While (n-- > 0) {
        L = Heaviest (B);
        Split L into L1 and L2;
        Remove L from B, and add L1 and L2 instead;
    }
    For all boxes in B do
        assign a representative (color centroid);
    For each pixel in Image do
        map to one of the representatives;
}
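A minimal Python sketch of the same idea follows. Several details are assumptions, because the pseudocode leaves them open: "heaviest" is taken to mean the box holding the most pixels, each box is split at the median along its longest axis, and the loop stops once n_colors boxes exist (the pseudocode instead counts n splits). Function names are hypothetical.

```python
def median_cut(pixels, n_colors):
    """Return up to n_colors representative colors for a list of (r, g, b) tuples."""
    boxes = [list(pixels)]  # start with one box holding every pixel color
    while len(boxes) < n_colors:
        # Only boxes with more than one pixel can be split.
        candidates = [i for i, box in enumerate(boxes) if len(box) > 1]
        if not candidates:
            break
        heaviest = max(candidates, key=lambda i: len(boxes[i]))
        box = boxes.pop(heaviest)
        # Split along the channel with the widest spread of values.
        axis = max(range(3), key=lambda c: max(p[c] for p in box) - min(p[c] for p in box))
        box.sort(key=lambda p: p[axis])
        middle = len(box) // 2
        boxes.extend([box[:middle], box[middle:]])
    # The representative of each box is the centroid of its colors.
    return [tuple(sum(p[c] for p in box) / len(box) for c in range(3)) for box in boxes]


def nearest_color(pixel, palette):
    """Index of the palette entry closest to the pixel (squared RGB distance)."""
    return min(range(len(palette)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, palette[i])))


image = [(0, 0, 0), (10, 0, 0), (200, 0, 0), (210, 0, 0)]
palette = median_cut(image, 2)
print(palette)
print([nearest_color(p, palette) for p in image])
```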
Better Solution
Generalized Lloyd Algorithm (GLA)
Color_GLloyd(Image, K) {
    - Guess K cluster centre locations
    - Repeat until convergence {
        - For each data point, find out which centre it is closest to
        - For each centre, find the centroid of the points it owns
        - Set a new set of cluster centre locations
        - optional: split clusters with high variance
    }
}
The GLA algorithm aims at minimizing the quantization error:
$$E = \sum_{k=1}^{K} \sum_{p_i \in C_k} \lVert p_i - c_k \rVert^2$$
where c_k is the centre of cluster C_k and p_i are the data points it owns.
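The loop above can be sketched in Python as follows. This is an illustration under assumptions: the initial centres are passed in by the caller rather than guessed, the optional splitting of high-variance clusters is left out, and all function names are hypothetical.

```python
def dist2(p, q):
    """Squared Euclidean distance between two colors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Component-wise mean of a non-empty list of colors."""
    return tuple(sum(p[c] for p in points) / len(points) for c in range(len(points[0])))


def gla(points, centres, max_iterations=100):
    """Alternate between assigning points to centres and recomputing the centres."""
    centres = [tuple(float(v) for v in c) for c in centres]
    for _ in range(max_iterations):
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda k: dist2(p, centres[k]))
            clusters[nearest].append(p)
        # A centre that owns no points keeps its old position.
        new_centres = [centroid(c) if c else centres[k] for k, c in enumerate(clusters)]
        if new_centres == centres:  # converged: nothing moved
            break
        centres = new_centres
    return centres


def quantization_error(points, centres):
    """Sum of squared distances from each point to its closest centre."""
    return sum(min(dist2(p, c) for c in centres) for p in points)


colors = [(0, 0, 0), (2, 0, 0), (10, 0, 0), (12, 0, 0)]
final = gla(colors, [(0, 0, 0), (12, 0, 0)])
print(final, quantization_error(colors, final))
```

Different starting centres can lead to different final palettes, since the procedure only finds a local minimum of the error.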
[Figure: the same image at 24-bit, 8-bit and 4-bit color.]
More on Color Quantization Observation 1: Distances and quantization errors measured in RGB space do not correspond to human perception. Solution: Apply quantization in a perceptually uniform color space (such as CIELAB).
More on Color Quantization [Figure: original image, RGB quantization, Lab quantization.]
More on Color Quantization Observation 2: Quantization errors are spatially dependent: we are more sensitive to errors at lower spatial frequencies. [Figure: sensitivity plotted against spatial frequency.]
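More on Color Quantization Solution: Assign a weight to each pixel color. Using this scheme we minimize:
$$E_w = \sum_{k=1}^{K} \sum_{p_i \in C_k} w_i\, \lVert p_i - c_k \rVert^2$$
where w_i is the weight of pixel color p_i and c_k is the centre of its cluster C_k. [Figure: image with its per-pixel weight map.]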
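With weights, the representative that minimizes the weighted squared error of a cluster is the weighted mean of its colors. The Python sketch below illustrates this for a single cluster. The weights in the example are invented, and how weights would be derived from spatial frequency is not covered here.

```python
def weighted_centroid(colors, weights):
    """Weighted mean of (r, g, b) colors; heavier pixels pull the result towards them."""
    total = sum(weights)
    return tuple(sum(w * c[ch] for c, w in zip(colors, weights)) / total for ch in range(3))


def weighted_error(colors, weights, representative):
    """Sum over pixels of weight times squared distance to the representative."""
    return sum(w * sum((a - b) ** 2 for a, b in zip(c, representative))
               for c, w in zip(colors, weights))


cluster = [(0, 0, 0), (10, 0, 0)]
weights = [3, 1]  # hypothetical: the first pixel lies in a smooth, sensitive region

best = weighted_centroid(cluster, weights)
print(best, weighted_error(cluster, weights, best))
print(weighted_error(cluster, weights, (5.0, 0.0, 0.0)))  # plain mean, for comparison
```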
[Figure: original image, standard quantization, weighted quantization.]