

======Digitizing the sound======
==By Alisha Harrington, Christopher Hernandez, Aaron Heston, and Samia Maurice==
==in collaboration with Prof. Nicholas Kuzma==

=====Introduction=====
//Feel free to expand and contribute//

Sound is what a living organism can perceive through its sense of hearing. [[digital sound project#references|[1]]] Physically, sound is vibrational mechanical energy that propagates through matter as a wave. For humans, hearing is limited to frequencies between about 20 Hz and 20000 Hz, with the upper limit generally decreasing with age. Other species (e.g. dogs) may have a different range of hearing. As a signal perceived by one of the major senses, sound is used by many species for detecting danger, navigation, predation, and communication. In Earth's atmosphere, water, and soil, virtually any physical phenomenon, such as fire, rain, wind, surf, or an earthquake, produces (and is characterized by) its unique sounds. Many species, such as frogs, birds, and marine and terrestrial mammals, have also developed special organs to produce sound. In some species these became highly evolved to produce song (e.g., birds and whales) and, in humans, speech. Furthermore, humans have developed culture and technology (such as music, telephony, and radio) that allow them to generate, record, transmit, and broadcast sounds. [[digital sound project#references|[2]]]

====Digital Recording====
[[wp>Sound]] can be [[wp>Digital recording|digitally recorded]] by virtually anyone, as many smartphones and personal computers have this capability. However, most people use these recordings simply to play them back at a later time; relatively few are concerned with examining the numerical record itself, analyzing it, or editing it. Nonetheless, there are entire scientific and engineering fields of digital [[wp>speech recognition]] and [[wp>speech synthesis]], and most of the progress in these fields stems from advances in the mathematical analysis of digitally recorded sound.
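As an illustration of what a digital recording actually is (a column of numbers plus timing information), the following Python sketch constructs the samples a recorder would produce for a pure tone. This is illustrative only; the project itself uses IGOR Pro. The 1500 Hz tone and 0.5 amplitude are arbitrary example values; the 16000 s<sup>-1</sup> sampling rate matches Table 1.

<code python>
import numpy as np

f_samp = 16000              # sampling rate (samples per second), as in Table 1
dt = 1.0 / f_samp           # dwell time: spacing between samples
N = 16000                   # record length: N*dt = 1 second of sound

t = np.arange(N) * dt                        # discrete sampling times t_i
v = 0.5 * np.sin(2 * np.pi * 1500.0 * t)     # sampled 1500 Hz pure tone

# The "recording" is just the array v together with the scaling (0, dt, "s")
print(len(v), dt)           # -> 16000 6.25e-05
</code>

Everything discussed below (clipping, quantization, spectra) is an operation on an array like ''v''.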
===Recordings by authors===
Using [[wp>IGOR Pro]] software, N.K. recorded the following simple vowels and a syllable: "Aaa", "o", "ee", "Ok", as shown in Fig. 1. The following recording parameters were used:

| **Table 1.** Recording parameters used in Fig. 1 |||
^ Parameter ^ Units ^ Value ^
| Hardware used | | iMac |
| Recording input | | Built-in microphone |
| Software used | | IGOR Pro 6.22A |
| Number of channels | | 1 |
| Sampling rate | samples per second $\left({\text s}^{-1}\right)$ | 16000 |

| {{ :figs:lec02fg3.jpg?nolink |}} |
^ Figure 1. Various spoken sounds recorded by N.K. ^

====History====
// This timeline is adapted from the [[wp>Digital recording]] article, [[wp>Wikipedia]]//
  * In 1938, British scientist [[wp>Alec Reeves]] files the first patent describing [[wp>Pulse-code modulation]].((Robertson, David. "Alec Reeves 1902-1971" [[| Telephone History]].))
  * In 1943, [[wp>Bell Labs|Bell Telephone Laboratories]] develops the first digital scrambled-speech transmission system, [[wp>SIGSALY]].((Boone, J. V., Peterson, R. R.: [[|"Sigsaly - The Start of the Digital Revolution"]].))
  * In 1957, [[wp>Max Mathews]] of Bell Labs develops the process to digitally [[wp>sound recording|record]] sound via [[wp>computer]].
  * In 1967, the first [[wp>digital audio]] magnetic tape recorder is invented at [[wp>NHK]]'s research facilities in Japan: a 12-bit, 30 kHz stereo device using a [[wp>compander]] (similar to [[wp>Dbx (noise reduction)|DBX Noise Reduction]]) to extend the dynamic range.
  * In 1970, James Russell patents the first digital-to-optical recording and playback system, which would later lead to the [[wp>Compact Disc]].((Inventor of the Week, [[|Massachusetts Institute of Technology]].))
  * In 1972, [[wp>Denon]] invents the first 8-track [[wp>reel to reel]] digital recorder.
  * In 1975, [[wp>Thomas Stockham]] makes the first digital audio recordings using standard computer equipment and develops a digital audio recorder of his own design, the first of its kind to be offered commercially (through Stockham's [[wp>Soundstream]] company).
  * In 1977, Denon's music label [[wp>Denon Records]], a division of [[wp>Nippon Columbia]], becomes the first record label to make a digitally recorded commercial album, using its state-of-the-art "Denon 034 multi-track system". The album, [[wp>Archie Shepp]]'s "On Green Dolphin Street", became the first digitally recorded album in the history of [[wp>Jazz music]], although it did not yet include vocals.(([[]].))
  * In 1978, Sound 80 Records of Minneapolis records "Flim and the BB's" (S80-DLR-102) directly to digital before pressing the vinyl LP. The mastering engineer is Bob Berglund; the recording system is a 3M Digital Audio Mastering System.
  * In 1979, the first digital [[wp>Compact Disc]] prototype is created as a compromise between sound quality and the size of the medium.
  * In 1979, the first digitally recorded album of [[wp>popular music]] to include vocals, "[[wp>Bop 'Til You Drop]]" by guitarist [[wp>Ry Cooder]], is released by [[wp>Warner Bros. Records]]. The album was recorded in [[wp>Los Angeles]] on a 32-track digital machine built by the [[wp>3M]] corporation. [[wp>Stevie Wonder]] digitally recorded his [[wp>soundtrack album]] "[[wp>Journey Through the Secret Life of Plants]]" three months after Cooder's album was released, followed by the Grammy-winning self-titled [[wp>Christopher Cross (album)|debut album]] of American singer [[wp>Christopher Cross]], which was also digitally recorded on 3M equipment.
  * In 1982, the first digital [[wp>compact disc]]s are marketed by [[wp>Sony]] and [[wp>Philips]],((Encyclopædia Britannica: "Compact Disc". 2003 Deluxe Edition CD-ROM. Encyclopædia Britannica, Inc.)) and [[wp>New England Digital]] offers the [[wp>hard disk recorder]] (Sample-to-Disk) option on the [[wp>Synclavier]], the first commercial [[wp>hard disk]] (HDD) recording system.(([[|Synclavier history]].)) Also that same year, [[wp>Peter Gabriel]] releases [[wp>Security (album)|''Security'']], and [[wp>Donald Fagen]] releases "[[wp>The Nightfly]]"; both were among the earliest fully digital recordings.

====Cultural and cinematographic references====
Digital sound recording is mentioned in ... In modern culture, it ...
  * ...
  * ...

=====Theory=====
====Assumptions====
Digitally recording the sound involves a number of energy transformations, with corresponding transformations of the oscillation amplitude. The analysis is greatly simplified by making the following assumptions, which are mostly true (except for very loud, very low-, or very high-pitched sounds):
  - The source of sound (e.g. vocal cords or a piano string) transforms some of the mechanical vibration energy into an outgoing sound wave
    * This transformation is assumed to be linear in amplitude
      * That is, doubling the original oscillation amplitude doubles the amplitude of pressure deviations in the sound wave
    * Many sources transmit different sound intensity in different directions
      * Speaking or singing projects more intensity in front of the speaker/singer than in the direction behind the person
      * A tilted open lid on a concert grand piano is designed to project sound at the audience
  - The transmission of sound through the air usually distributes the same sound energy over a wider area
    * Unless the sound is transmitted through a pipe or an elevator shaft, this leads to attenuation of the intensity with distance
    * The losses of sound energy into heat during transmission through air are usually negligible (in relatively clean, dust-free air)
  - The sound arriving at the [[wp>microphone]] is the linear superposition of the (variously delayed) sound waves emitted by all the sources
    * The superposition is the addition of the instantaneous pressure deviations from atmospheric due to each wave, as a function of time
    * Before the superposition, each wave is
      * individually attenuated (according to the distance and the directionality of the source)
      * individually delayed (by the time it takes to travel from the individual source to the microphone)
  - The pressure transducer in the microphone produces a voltage signal: $V(t)\sim \Delta P(t)$
    * This signal is linearly proportional to the pressure deviations from atmospheric, due to all inbound sound waves
  - The [[wp>Analog-to-digital converter|A/D converter]] transforms the continuous voltage signal $V(t)$ into the digital record $V_i(t_i)$
    * The time points at which the signal is recorded are discrete events $t_i=\{0,$ $\Delta t,$ $2\Delta t,$ $3\Delta t,$ $...$ $(N-1)\Delta t\}$
    * The time spacing $\Delta t$ between the subsequent recorded numbers is called the **dwell time**
    * The inverse of the dwell time is called the **sampling rate**: $\;f_\text{samp}=\frac{1}{\Delta t}$
      * Units: the sampling rate is measured in "samples per second" (${\text s}^{-1}$ or, equivalently, Hz)
    * The total number of points recorded (including the one at time $t\!=\!0$) is called the **record length** $N$
    * The [[wp>Nyquist frequency]] criterion states that the sampling rate must be at least twice the highest frequency to be recorded:
      * $f_\text{samp}\geq 2\,f_\text{max}$
      * E.g., in audio CDs the sampling rate is 44100 samples/sec, allowing audio frequencies up to 22050 Hz to be recorded
    * The **frequency resolution** of the recording (i.e. the ability to distinguish two very close tones based on their recordings) is
      * $\Delta f=\frac{1}{N\Delta t}\;$, where $N\Delta t$ is the **duration** of the recording in time
      * For example, to distinguish two tones 1 Hz apart, one must record for at least 1 second.
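    * These sampling rules can be checked numerically. The Python sketch below is an illustration only (the project itself uses IGOR Pro); NumPy's ''np.fft.rfft'' stands in for IGOR's ''FFT'' operation, and the 440/441 Hz tone pair is an arbitrary example. It samples two tones 1 Hz apart for exactly 1 second and shows that they land in adjacent frequency bins, i.e. $\Delta f=\frac{1}{N\Delta t}=1\,$Hz:

<code python>
import numpy as np

f_samp = 8000                   # sampling rate, well above 2*f_max for these tones
N = 8000                        # record length: duration N/f_samp = 1 second
t = np.arange(N) / f_samp       # discrete time points t_i

# Two tones only 1 Hz apart -- resolvable only because we record for >= 1 s
v = np.sin(2*np.pi*440.0*t) + np.sin(2*np.pi*441.0*t)

spec = np.abs(np.fft.rfft(v))**2          # discrete power spectrum
df = f_samp / N                           # frequency bin spacing = 1/(N*dt)
peaks = np.sort(np.argsort(spec)[-2:])    # indices of the two largest bins

print(df, peaks * df)                     # -> 1.0 [440. 441.]
</code>

    * Halving the duration (N = 4000) makes $\Delta f=2\,$Hz, and the two tones blur into one broadened peak, illustrating the frequency-resolution limit stated above.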
    * The voltage values that are recorded are also not continuous but discrete:
      * The possible values are usually of the form $V_i=\Delta V\!\cdot\!v_i$
        * where $\Delta V$ is the overall gain scale (measured and recorded in volts per bit (V/bit))
        * and $v_i\in \big\{-\!2^{\,n-1},$ $- 2^{\,n-1}\!+\!1,$ $...$ $-1,$ $0,$ $1,$ $2,$ $...$ $2^{\,n-1}\!-\!1 \big\}$ are integers
          * the $\in$ symbol means "with possible values from the following set"
      * The number $n$ is called the **bit depth**, **number of bits**, or **bit resolution** of the recorder; usually $n=8,$ $12,$ or $16$
        * For example, in an 8-bit recorder the integers $v_i$ can range from $-128$ to $127$ (in total, $2^8\!=\!256$ possible levels)
      * the most negative recorded voltage is $V_\text{min}=-2^{\,n-1}\Delta V$
        * For an 8-bit recorder, this evaluates to $V_\text{min}=-128\!\cdot\!\Delta V$
      * the most positive recorded voltage is $V_\text{max}=\left(2^{\,n-1}\!-1\right)\Delta V$
        * For an 8-bit recorder, this evaluates to $V_\text{max}=127\!\cdot\!\Delta V$
      * The smallest voltage difference that can be recorded is $\Delta V$; it is sometimes called the **voltage quantization scale**
      * Substituting the continuous voltage $V(t_i)$ with the nearest possible $\Delta V\!\cdot\!v_i$ can cause [[wp>Quantization (signal processing)|quantization errors]] for weak sounds
  - Neglecting quantization errors, the digital recording scales linearly with the pressure deviations of the sound at the microphone
    * $V_i(t_i)\sim \Delta P(t_i)$
  - The discrete frequency analysis of the recording fairly accurately represents the frequency components of the inbound sound, provided the necessary conditions are met:
    * The sampling rate is at least twice the highest possible (or audible) frequency
    * The duration of the recording is longer than the inverse of the finest frequency difference to be resolved
    * The typical recorded signal amplitude is much higher than the voltage quantization scale, but at the same time,
    * The maximum (and minimum) recorded signal levels are within the $V_\text{min}$ and $V_\text{max}$ bounds (the signal is not "clipped")

=====Methods=====
====Computational goals====
The main goal of this project is to characterize sound recordings (voice or a musical instrument) and to gain insight into the following:
  - Explore the limitations of digital recordings described in the Theory section above
    - How loud a sound can one record without clipping?
    - How quiet a sound can one record without quantization errors?
      * Simulate quantization errors by rounding the recorded sound to a much coarser voltage grid -- how does it sound (describe)?
  - The spectral frequency analysis of sound:
    - What makes different vowels distinguishable? (look at harmonics and overtones)
    - How does pitch (e.g. singing the same vowel at a different musical tone, low vs. high) affect the spectrum of the recording?
    - What about differences between a male and a female voice singing the same vowel?
    - What makes a specific musical instrument sound the way it does? (look at harmonics and overtones)
      - compare the spectrum of a note played on a musical instrument to a computer-generated beep (a single sine wave)
      - compare spectra of different notes played on the same instrument, or of the same note played on different instruments
  - //feel free to suggest more//

====Software====
  * [[wp>IGOR Pro]] version 6+ software will be used
    * A free, fully functional 30-day trial of version 6.3 can be downloaded from the [[|Wavemetrics]] website.
  * Installation is pretty standard, either for Mac or for Windows (XP, Vista, or 7 work fine, probably Win8 as well)
  * The following functionality is needed (''Cmd'' denotes the "apple" or "command" key on a Mac, ''Ctrl'' is the "control" key on Windows):
  - Accessing the **command window** to type in commands and see the history of past commands (''Cmd-J'' on Mac, ''Ctrl-J'' on Win)
  - Accessing the **procedure window** to paste macros and functions if desired (''Cmd-M'' on Mac, ''Ctrl-M'' on Win)
  - Accessing the **Data Browser** window to see and manipulate the objects (datafolders, waves, variables, and strings) created so far
    * Available via the "Data" menu (at the top of the window), "Data Browser" submenu (see Fig. 2 below)
  - Creating an array of numbers (or zeroes) for recording and playing sounds. Any array is called a **wave** in IGOR Pro.
    * In the command window, type (to execute any command you just typed, don't forget to hit "Enter" on the keyboard): <code>Make/N=20000 wave1</code>
    * This example will create a wave of length $N=20000$, named ''wave1''. It can be seen in the Data Browser in the ''root'' datafolder
    * IGOR code is **not** case sensitive; all commands and names can be entered in any case or any mixture of UPPER and lower CaSeS.
  - Setting up the time scale for a given wave
    * Any wave in IGOR can have "scaling" associated with it, e.g. the time points at which the data values were recorded
    * The "scaling" is characterized by equally spaced intervals.
    * Only three parameters are stored in memory:
      - The "timing" of the initial point (in our case, $t\,=\,0$)
      - The timing interval between the successive points (i.e. the dwell time), in our case $\Delta t$
      - The units of measurement (in our case, the character string "s" for seconds)
    * To set the "scaling" starting from $t\,=\,0$, with $\Delta t\,=50\,\mu$s, type into the command window: <code>Setscale/P x 0, 50e-6,"s",wave1</code>
      * Here ''/P'' denotes the "Start and delta" format
      * ''x'' is the x-scaling (the default; multidimensional waves can also have y, z, etc.). Therefore, in our case time is "x".
      * $50\times 10^{-6}$ can be entered as ''0.00005'', ''5E-5'', or ''50e-6'' -- all these forms are equivalent.
    * Alternatively (and equivalently), a sampling rate of 20000 samples per second can be set via <code>Setscale/P x 0, 1/20000,"s",wave1</code>
      * The above line works because $\Delta t=\frac{1}{f_\text{samp}}$
    * The scaling can also be checked and set via the "Change Wave Scaling..." submenu in the "Data" menu (Fig. 2)
  - Assigning pure sine-wave data to an IGOR wave.
    * In the command window, type <code>wave1=0.5*sin(2*pi*1500*x)</code>
    * This will assign the following data to the wave:
      * $V(t)=0.5\sin\big(2\pi\!\cdot\!(1500\,{\text{Hz}})\!\cdot\!t\big)$ at the time points specified by the scaling
    * Note that ''*'' must always be used for multiplication
    * ''x'' in the expression refers to the x-scaling (default) set by the ''SetScale'' command. In our case, it denotes the time points $t_i$
    * <color red>Warning</color>: This command will erase any data previously stored in ''wave1''
      * Use ''Duplicate wave1 anotherwave'' to store all the ''wave1'' data (including the scaling) in ''anotherwave''
  - Playing the sound recording from a wave:
    * In the command window, type: <code>playsound wave1</code>
    * If no sound is audible, please make sure your computer volume is not muted or set to zero (i.e.
you can hear a [[|YouTube]] video)
    * If there is still no sound (on Windows), try increasing the amplitude of your sound wave: <code>wave1=2000*sin(2*pi*1500*x)
playsound wave1</code>
  - Displaying the wave as a time plot
    * In the command window, type <code>display wave1</code>
    * Or build a new graph via the "Windows" menu, "New Graph..." submenu.
      * Choose ''_calculated_'' for the x-wave (this will utilize the "scaling")
    * To zoom in on the plot, drag a box (called a **marquee**) across the area you want to zoom in on, holding the left mouse button.
      * Release the mouse button, then click inside the marquee to select the zoom option
      * To zoom out to the original, all-inclusive view, simply press ''Cmd-A'' on Mac or ''Ctrl-A'' on Windows
    * Change the appearance by right-clicking or double-clicking on various components (margins, axes, traces, labels, grid, etc.)
  - Recording the sound... follow these steps:
    - Type in the command window, hitting the "Enter" key after each line:<code>SoundInStatus
edit W_SoundInRates
print V_SoundInSampSize</code>
    - The first command gets IGOR to ask your computer's sound system for possible sampling rates and other parameters
      * The possible sampling rates are stored in an automatically created ''W_SoundInRates'' wave
    - The second command simply displays this wave as an editable table
      * The first value is the number of possible rates (usually 5 or 7); it can be seen in the top left corner of Fig. 2
      * After that, all the possibilities are listed in units of ${\text s}^{-1}$.
    - The third command prints the automatically created variable ''V_SoundInSampSize'' in the command window's history area
      * On most Windows laptops this value is 3:
        * it means the sound card can only record into an array (wave) of 8-bit or 16-bit integers.
      * On a typical Mac, this value is 10:
        * it means the sound card can also record into an array (wave) of real (floating-point) numbers
    - Create an empty wave and set the sampling rate to one of the possibilities (using the ''SetScale'' command): <code>Make/O/W/N=131072 wave2
FindLevel/Q/R=[1,numpnts(W_SoundInRates)-1] W_SoundInRates, 40000
Variable/G bestrate=W_SoundInRates[min(W_SoundInRates[0],SelectNumber(V_Flag,round(V_LevelX),2))]
Setscale/P x 0, 1/bestrate,"s",wave2</code>
    - Here, the first line created a brand-new wave ''wave2'' filled with zeroes and having the default scaling (no units, start 0, step 1)
      * The ''/O'' option enables overwriting any previous data stored in ''wave2'' without producing an error
      * The ''/W'' option makes this a 16-bit integer wave (with possible values limited to integers from $-32768$ to $32767$)
        * On a Mac, if the value of ''V_SoundInSampSize'' is greater than 3, you can get a better-quality recording by not using ''/W''
      * The number of points is $N=131072=2^{17}$: choosing a power of 2 turns the Fourier transform into a Fast Fourier Transform
    - The second line searches the possible sampling rates in ''W_SoundInRates'':
      * ''/Q'' is the "quiet" mode: it prevents printing the result in the history and will not give an error if the value is not found
      * ''/R=[1,numpnts(W_SoundInRates)-1]'':
        * the search begins at point 1 (the 2<sup>nd</sup> value in the table showing ''W_SoundInRates'', since IGOR counts indices from 0)
        * and ends at the last point (= # of points $-1$, because the points are counted starting from 0)
      * the search looks for a sampling rate closest to 40000 samples per second, so that
        * the upper limit of the human hearing range is close to the Nyquist limit of $\frac{1}{2}f_\text{samp}=20\,$kHz
      * if the search doesn't find anything, the variable ''V_Flag'' is set to 1
      * if the result is found, ''V_Flag'' is set to 0, and ''V_LevelX'' is set to a number close to the index (x-scaling) of the result
    - The 3<sup>rd</sup> command sets a new
''bestrate'' variable to one of the possible sampling rates, depending on the following:
      * If ''V_Flag'' is 1 (no sampling rate close to 40000 found), the third number in ''W_SoundInRates'' is picked (index=2)
      * If ''V_Flag'' is 0 (a sampling rate close to 40000 was found), the index obtained in the search is rounded to the nearest integer
      * the ''min(,)'' ensures that the index does not exceed the # of choices stored in the zeroth element of ''W_SoundInRates''
      * the ''W_SoundInRates[...]'' selects the sampling rate from the ''W_SoundInRates'' wave stored at the calculated index
    - The fourth command sets the wave scaling to the inverse of the sampling rate stored in the ''bestrate'' variable
    - In the command window, type <code>SoundInRecord wave2</code>
      * <color red>Warning</color>: This command will erase any data previously stored in ''wave2''
        * Use ''Duplicate wave2 anotherwave'' to store all the ''wave2'' data (including the scaling) in ''anotherwave'', to preserve it
      * If an error comes up, check the sampling rate of ''wave2'' again: <code>print 1/deltax(wave2)</code> It should match one of the choices in ''W_SoundInRates'' exactly
      * If an error comes up on a Mac, try to rerun all the commands in step 9.b without the ''/W'' in the ''Make'' command:<code>Make/O/N=131072 wave2
FindLevel/Q/R=[1,numpnts(W_SoundInRates)-1] W_SoundInRates, 40000
Variable/G bestrate=W_SoundInRates[min(W_SoundInRates[0],SelectNumber(V_Flag,round(V_LevelX),2))]
Setscale/P x 0, 1/bestrate,"s",wave2</code>
      * The length of the recording is determined by the number of points in your wave (the record length $N$) and the sampling rate $f_\text{samp}$
    - To make sure the recording is successful, play it back (see above).
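    * As an aside, the rate-selection logic above can be hard to parse; here is the same idea sketched in Python. This is a hypothetical stand-in, not IGOR code: the list of "supported" rates, the ''best_rate'' function, and its fallback behavior are invented for illustration (IGOR's ''FindLevel'' actually searches for a level crossing rather than the closest value).

<code python>
# Sketch of the idea behind the FindLevel/SelectNumber lines: pick the
# supported sampling rate closest to a 40000 /s target, with a fallback.
# The list of "supported" rates below is an invented example.
supported = [8000.0, 11025.0, 22050.0, 44100.0, 48000.0]

def best_rate(rates, target=40000.0, fallback_index=2):
    """Return the supported rate closest to target (fallback if list is odd)."""
    if not rates:                       # nothing reported by the sound system
        raise ValueError("no supported rates")
    # index of the rate closest to the target
    i = min(range(len(rates)), key=lambda k: abs(rates[k] - target))
    return rates[i] if abs(rates[i] - target) < target else rates[fallback_index]

print(best_rate(supported))             # -> 44100.0
</code>

    * With the example list above, 44100 s<sup>-1</sup> wins, putting the Nyquist limit (22050 Hz) just above the range of human hearing.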
  - Calculating and displaying the power spectrum of the sound as a function of frequency
    * In the command window, type <code>FFT/MAGS /DEST=wave2_fft wave2; display wave2_fft; ModifyGraph log(left)=2</code>
    * This is an example of executing 3 different IGOR commands on the same line:
      * The commands in this case need to be separated by a semicolon '';''
    - The first command
      * computes the squared magnitude (power) of the [[wp>discrete-time Fourier transform]] of ''wave2'' as a function of frequency
      * saves it as a ''wave2_fft'' wave
      * note that the scaling of the ''wave2_fft'' wave is automatically in units of frequency (Hz), with the correct spacing $\Delta f$
      * <color red>Warning</color>: This command will erase any data previously stored in ''wave2_fft''
        * Use ''Duplicate wave2_fft anotherfft'' to store all the ''wave2_fft'' data (including the scaling) in ''anotherfft''
    - The second command creates a new default-style graph of the power spectrum contained in ''wave2_fft''
    - The third command changes the vertical axis of the graph to a log-type axis, so the weaker harmonics and overtones can be seen
  - Saving your work
    * Save the entire IGOR "experiment" (including the command history, code, data, plots, tables, variables and strings -- everything)
      * Use the "File" menu, "Save Experiment As..." submenu
      * Make sure to select the "Packed Experiment File" option to save everything as a single .pxp file.
      * Open the saved file to get back to where you finished last time
        * Access your file either from the Mac Finder or Windows Explorer, or from IGOR's "File" menu, "Recent Experiments" submenu
  - Automation
    * If you find yourself running the same commands over and over (trying to record something and plot the spectrum, for example),
    * it might be the right time to automate your tasks:
    - First, copy the commands you tend to repeat from the history area of the Command window
    - Go to the Procedure window (''Ctrl-M'' or ''Cmd-M'') and create an empty macro shell there, by typing below the ''#pragma'' line:<code>Macro MyMacro()
End</code>
      * Here ''MyMacro'' is the name of your macro; it can be any word without spaces or punctuation characters
      * ''()'' indicates that you do not have any input variables or strings
    - Paste the often-used commands between the ''Macro MyMacro()'' and ''End'' lines; make sure each command is on a separate line.
    - Compile the code by clicking "Compile" at the bottom of the Procedure window or by switching to the Command window
    - Go back to the Command window (''Ctrl-J'' or ''Cmd-J'') and run your macro by typing <code>MyMacro()</code> Or, you can run your macro from the "Macros" menu

| {{ :projects:sound:databrowser.png?nolink |}} |
^ Figure 2. Illustration of how to access the Data Browser window (shown on the right side) in IGOR Pro. ^

====Coding tasks====
This is the detailed list of tasks to be accomplished:
  - Record, display, and replay either voice or musical-instrument sounds (or any sound source you are interested in)
  - Try to simulate recordings that violate some of the assumptions in the Theory section (clipped, over-quantized, under-digitized in time, too short)
    * How does each artifact sound? Can you tell if you hear one in a recording?
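    * As a starting point, these artifacts can also be simulated outside IGOR. The Python sketch below is a hedged illustration using NumPy (the 440 Hz tone, the clipping thresholds, and the 3-bit depth are arbitrary example values, not project requirements): it clips a pure tone and re-quantizes it on a coarse voltage grid, the two artifacts described in the Theory section, and checks that clipping creates a 3rd harmonic that the clean tone does not have.

<code python>
import numpy as np

f_samp, N = 8000, 8000
t = np.arange(N) / f_samp
v = 0.8 * np.sin(2 * np.pi * 440.0 * t)       # clean 440 Hz tone

# Clipping: peg everything outside [low, high], as a saturated recorder would
low, high = -0.5, 0.5
v_clipped = np.clip(v, low, high)

# Quantization: round onto a coarse 3-bit voltage grid (8 levels)
dV = 2.0 / 2**3                               # quantization scale for a +-1 V range
v_quant = np.round(v / dV) * dV

# Clipping adds odd harmonics: compare power at the 3rd harmonic (1320 Hz)
spec_clean = np.abs(np.fft.rfft(v))**2
spec_clip = np.abs(np.fft.rfft(v_clipped))**2
bin_1320 = int(1320 * N / f_samp)             # frequency-to-bin conversion
print(spec_clip[bin_1320] > 100 * spec_clean[bin_1320])   # -> True
</code>

    * Listening to ''v_clipped'' and ''v_quant'' (e.g. after writing them out as audio files) gives a direct impression of how each artifact sounds.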
  - Compute the power spectrum of each recording, and interpret the various peaks you observe
  - Try to correlate the power spectra with the tone of the sound, its volume, possible artifacts, and the type of vowel or musical instrument

=====Data=====
//Feel free to post links to your data here//

====Questions about data====
  - {{ :projects:wave_2_graph.png?200|}} I have been following the directions for the digital sound project for a Windows computer. This graph of a wave was generated (see figure on the right); however, I haven't recorded anything. In fact, I haven't figured out how to record anything at all like the examples on this wiki page. I am hoping someone can tell me how to proceed.
    * **Answer**: What makes you think you haven't recorded anything? I think you did! I can see some noises and a loud bang in the beginning. It looks like a recording. If you post your saved IGOR experiment (assuming the file is not more than a few MB -- it shouldn't be, with just one recording), we can listen to the sound.
    * Probably you haven't made any sound while recording: why not repeat the recording step again, but this time,
      * as soon as you hit "Enter" on the ''SoundInRecord wave2'' command, start singing a note, like "Eeeeeeeeeeee", for about 3 seconds.
      * No need to plot again -- the existing plot of ''wave2'' will update automatically.
    * Also, try playing back your recording by entering ''playsound wave2'' into the command window. Do you hear your "Eeeeeeee"?
  - Okay, I did record a sound. I would like to post it so we can hear it and you can give me suggestions for experimenting with it or with additional sounds. However, when I tried to upload the file I received this message: "Upload denied. This file extension is forbidden!"
    * **Answer**: Good job! Sorry, I didn't realize you cannot upload IGOR files to the wiki. One workaround is to change the extension from pxp to txt; that might work -- I am not sure. You can also email me the pxp file as an attachment.
    * I think you are on the right track.
You should learn how to zoom in and out on the plot in Igor (by dragging a box with the left button of a mouse and then clicking inside the box) - do it repeatedly until you see the individual oscillations (See figure below). You can zoom all the way out by pressing ''Ctrl-A''. Look at the fine details of your plots. ''Wave2'' sounds like a piano note - do you have an electric piano or a piano app on your phone? I can also hear a dog barking and people talking outside - all this information is contained in the column of numbers that you recorded - I wonder which digits contain which sound. {{ :projects:sound:clipping.png?nolink |}} * Anyways, if you zoom in on the beginning part of wave2, as shown here on the second plot of the same ''wave2'', you will see that the amplitude is unnaturally constant for a while (before it starts dropping off), and the oscillations look more like square waves rather than sine waves. This tells me that your sound was too loud for the range of your recorder - that is, the actual sound was decaying in amplitude all along from the start, but the recorder was pegged at the maximum of what it can record (that is, "clipping your sound") until the sound decayed enough to "fit" into the range of the recorder. Does it make sense? Try to repeat the same recording, but either make the sound quieter, or move the laptop away from the source so that there is some attenuation (or even do what humans do - cover your microphone (if you know which hole it is on the laptop) with a cloth or a towel.) * Another tool (besides zooming in on the time course of the signal) is to look at the spectrum - this wiki page explains how to do this. For your beautiful piano-like tone, this should give a very nice thin line in the spectrum, if you are not clipping with your recorder. If you are clipping, like in ''wave2'' now, the spectrum will have extra stuff - the musicians can tell it's clipping because it sounds more "metallic" or even "buzzing". 
You can dedicate the entire project to the effect of clipping: compare clipped and unclipped sound (the perceived impression, the time plots, and the spectra).
  - I have collected some initial data and recordings. In my file:
    * ''Wave2'' is a medium-volume sound directly adjacent to the microphone.
      * Wave 2 obviously translates poorly and exhibits clipping.
    * ''Wave3'' is a soft-volume sound directly adjacent to the microphone.
      * Wave 3 sounds OK despite its close proximity to the microphone.
    * ''Wave4'' is a sound of varying volume at a typical laptop distance (my final data will have a constant pitch, this was just for my own ...)
    * But I have graphed all three spectra according to your directions, and they look nearly identical. What do I need to do to discern meaningful data from these graphs?
    * **Answer**: To compare the spectra directly, plot them on the same graph by <code>display wave2_fft,wave3_fft,wave4_fft; ModifyGraph log(left)=1,rgb(wave3_fft)=(0,0,65535),rgb(wave4_fft)=(0,0,0); SetAxis bottom 0,4000</code>
      * Actually, somehow all your Fourier transforms ended up being identical -- that's why they look the same!
      * Do them again: <code>FFT/MAGS /DEST=wave2_fft wave2; FFT/MAGS /DEST=wave3_fft wave3; FFT/MAGS /DEST=wave4_fft wave4;</code>
      * Look at the most recent plot you made. Now you see differences.
    * It also makes sense to plot the time traces on the same plot: <code>display wave2,wave3,wave4; ModifyGraph rgb(wave2)=(65535,40000,40000),rgb(wave3)=(0,0,65535),rgb(wave4)=(0,0,0); SetAxis bottom 0,.04</code>
      * This highlights the striking differences in intensity.
  - The other thing is, I have been looking at Fourier-series calculations and have seen the use of integrals when computing the Fourier coefficients. Should I be concerned about this?
    * **Answer**: there are two kinds of Fourier transforms: [[wp>Fourier transform|continuous]] (liked by theorists) and [[wp>Discrete Fourier transform|discrete]] (used in numerical computations).
The two kinds are closely related, and in the limit of small grid spacing (small dwell time $\Delta t$) they give approximately the same results. However, the continuous transform uses integrals, whereas the discrete one uses simple sums. You don't need to worry about this too much; just use it as a tool that already works. When you write about it, just say that it decomposes the time-dependent signal into a linear superposition of pure sine and cosine waves at various frequencies on a frequency grid, and gives you the combined amplitude at each frequency, as a function of the point on the frequency grid.
  - Is there a way to copy and paste the plots from IGOR Pro into a document, or is there another way to display the plots in a document?
    * **Answer**: You can use
    * "screenshots" (these will work for any screen content, not just IGOR plots)
      * explained in [[white_noise_project#software_and_data_analysis|the white noise project, Software and data analysis, Item 10]] <- click on the green link
    * Or, in IGOR, after clicking on your graph, go to the ''File'' menu, ''Save Graphics...'' submenu, and select
      * the format (PNG or JPEG)
      * the resolution (Other DPI, then 300)
      * the file name -- the name of the picture file that IGOR will create for you
      * the path (home -- will save in the same directory on your computer where you have saved your .pxp experiment)
      * force overwrite -- if you already have that picture file and want to replace it
    * Then just insert the picture file into your report as a picture
  - Next question <-- <color red>//ask your questions here//</color>

====Tips and suggestions====
  - You can simulate clipping by using the following commands (assuming you want to clip a ''wave2'' that already exists):
    * the function ''min(...,...)'' returns the smaller of the two arguments
    * the function ''max(...,...)'' returns the larger of the two arguments
    * before you start, you need to set up your ''low'' and ''high'' variables first: <code>Wavestats/Q wave2; print V_min,V_max  // this line finds the min and max of wave2
Variable/G clip_perc=80  // this creates a variable for storing the clipping percentage
Variable/G low=clip_perc*V_min/100,high=clip_perc*V_max/100  // sets low and high thresholds for clipping</code>
      * Here ''%%//%%'' is the "comment" symbol; it tells IGOR to ignore the remainder of the line, allowing you to comment your code
    * now create another wave and combine ''min()'' and ''max()'' as ''max(low,min(high,...))'': <code>Duplicate/O wave2,wave2_clipped  // create an identical wave2_clipped first
wave2_clipped=max(low,min(high,wave2))  // now do the clipping</code> to limit the values in ''wave2_clipped'' to the [low,high] range
    * you can achieve different levels of clipping by setting ''clip_perc'' closer to 0 (more severe clipping) or closer to 100 (more gentle)
    * you can display the clipped wave as a plot, calculate and display its spectrum, and play it to listen to how it sounds

=====References and Footnotes=====
====Cited references====
  - Strutt ([[wp>John William Strutt, 3rd Baron Rayleigh|Rayleigh]]), J. W.; Lindsay, R. B. (1877). //The Theory of Sound//. Dover Publications. ISBN 0-4866-0292-3.
  - From N.K.'s past contributions to the [[wp>Sound]] article on [[wp>Wikipedia]].

====Footnotes====

digital_sound_project.txt · Last modified: 2014/06/04 03:53 by wikimanager