NOTE: this documentation is by no means complete at this point. Please treat it as a work in progress. All of the analysis/processing techniques available in mxv are well-known to those who work with computer-generated sounds. Manual pages and other specific information about each of them would overly expand the size of this document, so they are only described briefly. More information is available elsewhere for those who wish to learn more. MiXViews (mxv) Introduction MiXViews is an editing, processing, and analyzing tool for digitized sounds and other forms of binary data. It is built upon the InterViews X library, and runs within the X window environment. ------------------------------------------------------------------------------- Startup Arguments MiXViews has three different types of command-line arguments: 1) InterViews arguments These include all standard X application arguments like -fn, -rv, etc. 2) MiXViews arguments Currently just two: -autoplace, which causes the windows to map without user input -plotwidth, which determines the width in pixels of the data display 3) File arguments: There are two of these, and they must immediately follow each file to which they will apply: -skip N, where 'N' is the number of seconds to skip before reading. -duration M, where 'M' is the duration in seconds to read in the file. -skip defaults to 0, and -duration defaults to the entire file length. Example: "mxv 1st.snd -skip 3.5 2nd.snd -duration 5.0" will skip 3.5 seconds into 1st.snd, and read from there to the end, and will read the first 5 seconds of 2nd.snd. ------------------------------------------------------------------------------- General Paradigm Mxv is based on the MVC (Model-View-Controller) paradigm of object-oriented programming. What this means is that every chunk of data being edited, whether it be a digital sound, an amplitude envelope, a linear-predictive coding file, or whatever, is represented in the program as a type of object called a model. 
The user (this means you) interacts with this model via another object called a view. The view lets you "see" the model, that is, it displays the model's data in some format that makes sense of the values, usually as a graph of some sort. Any given model can have any number of views associated with it. All of these are related in that they provide "windows" into the data that is being edited. There will always be at least one view; when the last open view is closed, the model will be destroyed, usually after having been saved to disk. The third part of the MVC paradigm, the controller, is an object that coordinates the communication between the model and its view(s), and is not visible to the user.

-------------------------------------------------------------------------------
Files vs. Views

Because mxv stores all the data being edited in virtual memory (rather than continuously reading and writing to disk), each view or window does not necessarily have a corresponding file on disk. When any on-disk file is opened in mxv, it is displayed as a single view with the file's name on the title bar. However, many windows may exist which have never been written to disk (and which have no associated file), or multiple views of a single file may be visible. Between the time a file is read by mxv and the time any individual model is written out to a file, it is better to think in terms of data objects rather than files.

-------------------------------------------------------------------------------
Types of Data

The most common form of data file available for editing is the soundfile. Mxv is able to read soundfiles with IRCAM-style, native NeXT, Sun (au), AIFF-C, or WAVE format headers. Soundfiles can have 1, 2, or 4 channels, and can have sample formats of 8-bit linear, 8-bit mulaw, 16-bit linear, or floating point linear. Compressed data formats (other than mulaw) are not readable except on the Silicon Graphics platform.
Other types of data that may be edited are as follows:

    Name      File Suffix   Description
    LPC       .lpc          Analysis data from Linear Predictive Coding
    Pitch     .pt           Pitch Track analysis data
    Envelope  .evp          General purpose data curves, stored as doubles
    FFT       .fft          "Time Slice" Fast Fourier Transform analysis data
    Pvoc      .i or .pv     Phase Vocoder analysis data

Files with no suffix, or with any other suffix, will be read (or attempted to be read) as soundfiles. If the "read raw file" option is set in the resource file or in the options menu command, mxv will attempt to read files without headers, based upon the information you supply, and based on the file suffix (for example, a raw file with a .lpc suffix will be interpreted as a raw LPC data file).

-------------------------------------------------------------------------------
Targets and Sources

Many operations upon data objects involve taking a portion of one and putting it into another. Sometimes it will be spliced in, sometimes mixed in, and sometimes combined in some other way. In this document, the object being taken from is called the source, and the object being added to is the target. The general procedure for this in mxv is as follows:

Using the mouse, the user highlights some portion of a view (displays it in reverse video), which is called selecting it. The technique for doing this is explained later. The user then picks some other view, and again using the mouse, sets an edit point in that view. If the user then performs a "splice in" operation in this window, this window will be the target of the operation, and the previously selected window will be the source.

If an operation (e.g. scaling) involves only a single view, then that view is the target, and there is no source. In these cases, the target is simply called the selection. For example, if the user wished to scale a section of a sound, he/she would highlight (select) a portion of the visible sound, and then choose "phrase" from the available modifying commands (see below).
The operation will take place upon the current selection only.

If the target and source formats differ in their maximum amplitude capability (for example, mixing a floating point sound into a short integer sound), the source material will be scaled to match the target in the following way: for formats with fixed maximum amplitudes, the ratio of signal amp to maximum amp will be preserved. Otherwise (for floating point and double-precision), the ratio of signal amp to max amp for the file will be mapped -- for instance, a source selection with an amp of 100,000 from a file with a max of 300,000 will be scaled to about .33 of the target file's maximum amplitude.

-------------------------------------------------------------------------------
Adjusting the View

The data (waveform) view window has a horizontal scroll bar which contains a button that indicates the portion of the data currently visible. Each of the three mouse buttons has a different effect when clicked in the scrollbar:

    Left:    Scroll 1/4 page in the direction of the click.
    Middle:  Scroll 1 page in the direction of the click.
    Right:   Scroll to the location in the file indicated by the click.

The arrows on either end will scroll 1/4 page, or 1 page if shift-clicked.

The view resolution is determined by the horizontal and vertical zoom buttons. The resolution may also be set directly via the view menu commands (see below). In addition, the arrow keys can be used in combination with the control key to adjust the resolution:

    Up Arrow:     Vertical zoom out
    Down Arrow:   Vertical zoom in
    Right Arrow:  Horizontal zoom out
    Left Arrow:   Horizontal zoom in

Note: Due to window manager interactions, the arrow keys may not function correctly.

-------------------------------------------------------------------------------
Selecting Edit Points

The fastest way to select editing points is with the mouse.
In all data display windows, the mouse buttons behave as follows:

    Left Mouse:    set insert point (or beginning of edit) at current location
    Middle Mouse:  set end of edit region at current location
    Right Mouse:   select entire file for editing

The Control key modifies these as follows:

    Control-Left Mouse:    continuously update insert time and duration display, and display amplitude for current frame
    Control-Middle Mouse:  continuously update selection time and duration display
    Control-Right Mouse:   select visible portion of data for editing

If the key is held down with any of the described combinations, only the channel that is under the cursor will be selected. Without it (the default), all channels will be selected.

When the mouse button is released, the current selection will be highlighted, and the numerical values will be displayed in the panel below and to the right of the data display panel (regardless of whether the key is down or not).

Edit points may also be set via the Set Insert Point and Set Edit Region commands (see below).

-------------------------------------------------------------------------------
Shifting Edit Points

A previously selected insert point or region may be shifted to the right or left by one unit using the 'l' and 'h' keys, respectively. This is intended to parallel the 'vi' editor commands. A "unit" is defined as one horizontal pixel or one data frame, whichever is wider.

In addition, a selected (highlighted) region may be expanded or shrunk in size by one unit using the '+' and '-' keys (do not use the key with the '+'). Also, the selected region may be "collapsed" into an insert point set to either the beginning or the end of the region, using the key or the space bar, respectively.
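The shifting rules above can be illustrated with a small Python sketch. This is not mxv's own code: the `Selection` class, the function names, the `frames_per_pixel` parameter, and the clamping behavior at the ends of the data are all assumptions made for illustration. Only the 'l', 'h', '+', and '-' keys and the definition of a "unit" come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selection:
    start: int   # first selected frame
    length: int  # number of selected frames; 0 means a bare insert point


def unit_in_frames(frames_per_pixel: float) -> int:
    # One "unit" is one horizontal pixel or one data frame, whichever is wider.
    return max(1, round(frames_per_pixel))


def shift(sel: Selection, key: str, frames_per_pixel: float,
          total_frames: int) -> Selection:
    # 'l' moves the selection right and 'h' moves it left, as in vi.
    step = unit_in_frames(frames_per_pixel)
    delta = {"l": step, "h": -step}[key]
    # Assumption: the selection is clamped so that it stays inside the data.
    start = min(max(sel.start + delta, 0), total_frames - sel.length)
    return Selection(start, sel.length)


def resize(sel: Selection, key: str, frames_per_pixel: float,
           total_frames: int) -> Selection:
    # '+' expands and '-' shrinks the region by one unit.
    step = unit_in_frames(frames_per_pixel)
    delta = {"+": step, "-": -step}[key]
    # Assumption: the right edge moves while the start stays fixed.
    length = min(max(sel.length + delta, 0), total_frames - sel.start)
    return Selection(sel.start, length)
```

With a display resolution of 4 frames per pixel, for instance, `shift(Selection(10, 5), "l", 4.0, 100)` moves the region 4 frames to the right, while at 0.25 frames per pixel the same key moves it by a single frame.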
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Mxv Menus

Different menus are available depending on the type of data being edited, but all types have the view, file, edit, display, and options menus.

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
View Menu

The view menu contains all commands relating to the visual display of the data.
-------------------------------------------------------------------------------

new view of selection
    Open new window (new view of current model) with the display set to show the current selection.

zoom to selection
    Expand current selection to fill the entire window.

zoom to full
    Set display to view the entire model.

set frame view...
    Prompt for desired horizontal frame or sample display.

reset vertical scale
    Return the vertical scaling to its default (starting) value.

View Mode ->
    Pop-up submenu with commands for switching the display between channel mode (successive channels of the data displayed in stacked graphs) and frame mode (cascade display of amplitudes of all channels as a function of time). This latter mode may only be used on files with at least 4 channels.

Channel Display ->
    Pop-up submenu with channel display commands (see below).

display copy buffer
    Open new window displaying the internal copy buffer, if there is anything in it. This buffer is filled via the copy command, and by the remove and splice out commands.

close current view
    Close this window. If this is the last remaining view of a model, mxv will warn you if there are unsaved changes to this model.

show program version
    Display current version number and copyright.

quit program
    Close all views of all models and exit program. Mxv will warn you about any unsaved models.
-------------------------------------------------------------------------------
Channel View Submenu

In channel view windows, these commands affect the number of graphs that are stacked and visible in the view. In frame view mode, these commands affect the horizontal display, because each visible horizontal plot represents all channels for a given frame.
-------------------------------------------------------------------------------

set channel view...
    Prompt for specific set of channels to display.

add channel
    Add next available channel to the displayed channels. If there are no additional channels, the command has no effect.

remove channel
    Remove highest displayed channel from the display. No effect when only one channel is displayed.

shift up/down
    Shift channel display up or down by one increment. For example, if channels 0 and 1 were displayed, choosing shift up would display channels 1 and 2. Causes horizontal shift in cascade displays. No effect if no additional channels are available.

-------------------------------------------------------------------------------
File Menu
-------------------------------------------------------------------------------

set default dir...
    allows user to enter a directory name to be used as the default directory for opening/saving the current data type.

new...
    allows user to create an empty file in memory (not written to disk) for scratch use. Appropriate information for each data type may be set.

New Type ->
    pull-out menu for creating a new file of selected type (as opposed to new... which always creates the same type as the current window).

open...
    pops up an open panel to let user choose another file to open. The default directory to open can be set for each data type via either the set default dir command (see above) or the X resource file (see below). User may enter an amount of time (measured in seconds) to skip into the file before reading, and/or a duration of time to read in the file.
The default is always 0 inskip, total duration. If a file is opened as a segment, i.e., nonzero inskip and/or partial duration, it will be displayed in mxv with a new name constructed from the string "tmp_" plus the skip and duration values, followed by the name of the file. This avoids the possibility of accidentally overwriting the original file with the segment. If a raw (headerless) file is opened, a dialog panel will be displayed asking the user for information about this file. Mxv has no way of checking the validity of this information, so be careful. Users can specify arbitrary-sized amounts of data to be skipped (for unknown header types, for instance) prior to reading data. This value is independent of any inskip time that might have been specified in the open panel. Users may also specify whether the byte order of this raw data needs to be swapped. save... (re)write file to disk. If file is tmp or untitled file, user will be prompted for a name via the "save as" command. File will always be written out with the same header format as the file on disk. save as... write file out with a new name (changes name of current file to match). The user may choose the type of header (if any) used to write a new data file. If overwriting an existing file, the user may choose to force a new header type or preserve the existing one. revert re-read file from disk, undoing all changes since the last save to disk. File must have been saved to disk at least once for this to work (i.e., temp files and untitled files cannot be reverted). change name... prompts user for new name for current file. The user will be warned if the new name matches an existing name in the default directory for the particular data type. change file comment... allows user to edit a text comment for data files. This is stored as part of the file header on disk. A text editor window will be displayed in which the current comment (if any) will be displayed. 
After editing, the changes must be saved by selecting "save comment" in the Edit menu. file information... displays a window with all appropriate data about the file, including name, sample rate, length, file size in Megabytes, and number of channels. ------------------------------------------------------------------------------- Edit Menu ------------------------------------------------------------------------------- set insert point... prompts user for exact location of desired insert point. This bypasses the "quantization" of the insert point due to the current horizontal display resolution. set edit region... prompts user for exact location of desired selection. Also bypasses quantization as in previous command. copy copies the current selection into the global copy buffer. copy to new copies the current selection into a new temporary file remove copies the current selection into the buffer and then erases (zeroes) the region. remove to new does a copy-to-new followed by an erase. erase erases (zeroes) region WITHOUT storing it in buffer. This cannot be undone. splice out copies the current selection into the global buffer and then splices it out of the original file, i.e., shortens the file by the length of the region and moves the remainder to the left. Note: Mxv will not allow splicing out of the entire file. Minimum file length is at least two frames. splice out to new does a copy-to-new followed by a splice out. delete splices out the region WITHOUT storing it. This cannot be undone. mix adds the contents of the global buffer or current source, sample by sample, to the file beginning at the insertion point. replace replaces (destructively writes) the contents of the global buffer or current source to the file beginning at the insertion point. crossfade... allows user to combine two segments with a crossfaded overlap zone. The user may choose a curve from a set of predefined choices, or read one from a .evp file on disk. 
splice in inserts the contents of the global buffer or current source at the insertion point, shifting the contents to the right. Opposite of splice out. ------------------------------------------------------------------------------- Modify Menu (for Sounds) ------------------------------------------------------------------------------- phrase... multiplies the current selection by an amplitude factor. apply envelope... multiplies the amplitude values of the selected region by selected source envelope. If no source is currently selected, program will prompt for one. reverse reverses the current selection in time. transpose... transposes the current selection by either an equal-tempered or linear-octave interval, or by any frequency ratio. The result will be displayed in a new window -- the original file is unchanged. time shift... not yet implemented Filter -> pull-out Filter submenu (see below) insert space... allows the user to splice in any number of seconds worth of silence, starting at the insert point. normalize values scales amplitude values of current selection between 1.0 and -1.0. This is useful for converting sounds into control envelopes. The user will be warned if attempting this on a non-floating-point data type (which would turn the data into random 1's, 0's and -1's). add delay... shifts selection to the right by any number of frames add DC offset... adds a fixed offset to every sample remove DC offset applies a sharp filter to remove all frequencies below 20 hz. Useful for removing LF rumble, etc. ------------------------------------------------------------------------------- Filter submenu (Sounds only) ------------------------------------------------------------------------------- low pass current selection is filtered by a first-order (one pole) lowpass filter. resonant current selection is processed by a second-order (two pole) resonant filter. Center frequency, bandwidth ("Q"), and gain-mode can be set. 
comb
    current selection is filtered by a comb filter. Value can be set as either center frequency in Hz or as a loop (delay) time in seconds. The "Q" of the filter is measured as the time it takes the impulse response to fall by 60 dB. NOTE: This routine, like the transposer, puts its output into a new window.

elliptical
    current selection is processed by a variable-length elliptical filter, which is capable of very sharp rolloff curves, and may function as a high, low, or bandpass filter with variable stopband ripple and attenuation. The passband cutoff is the frequency at which the rolloff will begin. The stopband cutoff is the frequency at which the amplitude has been reduced by the value given in the stopband attenuation field. Therefore, if the stopband is greater than the passband, a low-pass filter will be created, otherwise a high-pass filter will be created. The sharpness of this filter is infinitely variable through changes in the passband, stopband, and attenuation. If the bandpass stopband is nonzero, it will be used as the max attenuation point for the side of a bandpass filter not set by the stopband. The ripple factor determines the amplitude of the sidebands which are an unavoidable part of any filter. Smaller values produce bigger and slower filters.

LPC formant
    use a previously created Linear Predictive Coding datafile to apply a time-varying formant filter to the current selection. A portion of an open LPC datafile must be selected first, followed by the selection of a region in a sound to be filtered. Note: This filter often produces large amplitudes, and is best performed upon a floating-point soundfile. The user has a choice of LPC frame interpolation methods; linear is faster, but recalculated produces smoother results, and is useful when stretching a small number of LPC frames over a large amount of sound. A warp factor may be specified to shift the formant peaks up or down.
    See Analysis menu, below, and Appendix B for more information about LPC data.

-------------------------------------------------------------------------------
Modify Menu (Other data types)
-------------------------------------------------------------------------------

normalize values
    same as above

smooth curve
    applies a fixed lowpass filter to data -- y[n] = .5*x[n] + .5*x[n-1], where x[n-1] is the value preceding x[n]

scale values...
    same as "phrase", above

rescale to fit...
    allows the user to scale selected region to fit between an arbitrary maximum and minimum value

add offset...
    same as "add DC offset", above

apply envelope...
    same as above

reverse
    same as above

insert space...
    same as above

add delay...
    same as above

stretch/shrink
    like "transpose", above, but specified as ratio or as new length, rather than interval

-------------------------------------------------------------------------------
Display Menu (channel view)
-------------------------------------------------------------------------------

Graph Style ->
    waveform graph may be displayed as either a continuous line graph or a solid bar graph. The bar graph is useful for displaying individual samples for editing purposes.

Horiz Scale Units ->
    horizontal scale for waveform may be shown in either time (based on frame rate), frame numbers (sample numbers for sounds), or SMPTE frames (see "Scale Options" in options menu, below).

-------------------------------------------------------------------------------
Display Menu (frame view)
-------------------------------------------------------------------------------

Horiz. Scale Units ->
    cascade-style frame display's horizontal axis may be displayed in either frequency or band (channel) numbers.

Vert. Scale Units ->
    vertical (y) axis, which is always amplitude, may be displayed in either linear (amplitude) mode or log (decibel) mode. This is especially useful for examining fft and phase vocoder data.
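As an aside on the "smooth curve" command listed under the Modify Menu for other data types, the fixed lowpass filter it describes is a two-point average of each value with its predecessor. The Python sketch below shows that idea only; the function name `smooth` and the treatment of the first value (which has no predecessor) are assumptions, not a description of how mxv handles the edge of a selection.

```python
def smooth(values):
    """Two-point average: each output is half the current value plus half the previous one."""
    if not values:
        return []
    # Assumption: the first value has no predecessor, so it is passed through unchanged.
    out = [values[0]]
    for prev, cur in zip(values, values[1:]):
        out.append(0.5 * cur + 0.5 * prev)
    return out
```

Applied to an alternating curve such as `[0.0, 1.0, 0.0, 1.0]`, the rapid alternation is flattened to `[0.0, 0.5, 0.5, 0.5]`, while a constant curve passes through unchanged.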
-------------------------------------------------------------------------------
Options Menu

Note: Besides the commands listed below, each data type has its own options menu command in its data-specific menu (e.g., under the "Sound" menu for sounds).
-------------------------------------------------------------------------------

Peak Rescan ->
    Displays a submenu allowing user to disable the normal peak rescan that occurs after every editing operation. This will speed up editing significantly, but will not update the vertical scale on the screen.

Global Options...
    Allows user to set global program options. Many of these options are also settable via a .Xdefaults file (see mxv X resources, below).

    Alert Beep Volume: Set the beep level relative to the base level set for the keyboard (see the xset man page). Note: The beep cannot be shut off here -- use "xset b off" to do this.

    Dialog Panels Ignore Window Manager: See Appendix D for more information.

    Auto-Place Windows on Screen: For window managers which offer a choice, windows may be placed automatically in a staggered position on screen.

Scale Options...
    Allows user to set options regarding vertical and horizontal scales. Currently, the only settable option is the SMPTE frame format, in which the user can choose any of the standard SMPTE frame rates. Note: This feature is still in development.

File Options...
    Allows user to set options regarding files and directories. Some of these options are also settable via a .Xdefaults file:

    Read Raw (Headerless) Files: If this option is on, mxv will attempt to read files without headers, based upon the information you supply, and based on the file suffix (for example, a raw file with a .lpc suffix will be interpreted as a raw LPC data file). It will also attempt to reread all unreadable (i.e., empty or without read permission) files as raw files. See the open... command, above, for further information.
Store/Recall Browser Path: If this is set to YES, each subsequent open command will return you to whatever was the last directory from which a file was read, as opposed to returning you to the default directory for that data type. Memory Options... For more complete control over the amount of memory used by mxv to store and edit data, a limit may be arbitrarily imposed on the size of any single allocation (for example, when a new file is opened or created in memory) and on the total amount of memory used by mxv for all open datafiles. If the single or total allocation exceeds its associated limit, an alert panel pops up informing you of this, and giving you a choice of whether to continue or not. Default values are: 200mB for total allocation, 100mB for single allocations. NOTE: These limits are not independent of the machine's own memory limits. Real memory allocation errors will still occur when the machine's virtual memory limit is reached if that limit is less than the one set in mxv. As much as possible, mxv is designed to recover gracefully from all such allocation failures. ------------------------------------------------------------------------------- ------------------------------------------------------------------------------- Special Menus Many datatypes have specific menus for editing/processing commands only available for that data type. ------------------------------------------------------------------------------- ------------------------------------------------------------------------------- ------------------------------------------------------------------------------- Sound menu (Sounds only) ------------------------------------------------------------------------------- play play the current selection through the machine's digital-to-analog converters (if any) record record sound into currently selected region. 
This assumes that the sound type is compatible with the converter in use.

stop
    halt the current play or record operation, if any (See Playing and Recording Sounds, below)

rescan for peak
    scan the soundfile to determine the current minimum and maximum values.

rescale
    sets the peak amplitude of the file to +-32767 (for mu-law & 16-bit samples), +-1.0 (for float samples), and +-127 (actually 0 - 255) for 8-bit samples.

change sample format...
    allows user to convert file from one format to another, e.g., floating point to short int or short int to 8-bit mu-law. File name will have a new suffix appended to show it has been converted.

change sample rate...
    allows user to change the sampling rate of the soundfile.

change file length...
    change the overall length of the current file. User will be warned if reducing the length (which will destroy data).

Synthesis ->
    pull-out Synthesis submenu (see below)

D/A Converter ->
    pull-out Converter submenu (see below)

sound options...
    set the default format for sounds, including sampling rate, sample format, and header format. These defaults will be used in the "Save to" panel and the "New Sound" panel.

-------------------------------------------------------------------------------
Synthesis Submenu (Sounds only)

All the synthesis techniques require a "source" selection to be made (this will be the data controlling the synthesis) and a "target" selection (this will be the portion of a sound into which you wish to synthesize new sound).
-------------------------------------------------------------------------------

LPC resynthesis...
    prompts user for parameters for resynthesis of sounds from existing LPC data. The voiced threshold is the frame error value below which the resynthesized sound will be entirely pitched material; the unvoiced threshold is the frame error value above which it will be entirely unpitched (noise) material. The pitched/unpitched ratio allows adjustment of the mix -- the default is a good starting point.
The warp factor allows shifting of the formant peaks up or down. The user has a choice of LPC frame interpolation methods; linear is faster, but recalculated produces smoother results, and is useful when stretching a small number of LPC frames over a large amount of sound. See Appendix B for more information. Phase Vocoder resynthesis... prompts user for parameters for resynthesis of sounds from existing Phase Vocoder data. See Appendix C for more information. ------------------------------------------------------------------------------- Converter Submenu (Sounds only) ------------------------------------------------------------------------------- NeXT, SPARC, SGI, etc. items show the choice of D/A converters, if any. converter settings... allows user to set converter parameters such as record and play levels, depending on the platform. reset converter reset the converter to its pre-initialized state. This is for use after a converter initialization failure. ------------------------------------------------------------------------------- Analysis Menu (Sounds only) For all analysis techniques that involve frame rates and offsets, the dialog panel works as follows: if the frame rate value is left at zero, the value for the frame offset will be used. If the frame rate value is nonzero, the frame offset value will be ignored, and will be set to samplerate/framerate. ------------------------------------------------------------------------------- show maxamp sample location sets the cursor to the location of the peak sample in the file extract amplitude envelope extracts an N-point amplitude envelope curve from the selected region. Amplitudes may be linear, RMS, or decibel. This will be stored in an Envelope data type display. FFT analysis runs a Fast-Fourier Transform on a selected region with various size transforms and frame offsets. This FFT display is for analysis purposes only and cannot currently be used for any other purpose in MiXViews. 
LPC analysis
    runs a Linear-Predictive Coding analysis on the selected region. This uses the LPC program developed by Paul Lansky at Princeton. A dialog window will let you set the number of poles and the frame size and offset. See appendix B for further information.

extract pitch envelope
    runs a pitch tracking analysis on the selected region. Pitch range should be set to approx. 10% above and below the expected range of fundamentals. The RMS amplitudes of the frames are also displayed. Pitch tracking information is used in conjunction with LPC analysis; the pitch channel (channel 2) is usually merged into the 4th channel of an LPC data file (see the merge pitch data command under LPC menu). The user can choose whether to pre-filter the data with a bandpass filter (slower, but much better frequency accuracy) or a lowpass filter (faster, but not useful at low or high frequencies).

Phase Vocoder analysis
    runs a Phase Vocoder analysis on the selected region.

    Input frame size is the number of samples to analyze per frame. Larger values will give more precise spectral peaks, but tend to produce blurred sounds if the timbre changes quickly.

    Input frame offset is the number of samples to shift over for each frame. This defaults to framesize/8 unless a non-unity time scaling factor is used.

    Input frame rate, if nonzero, sets the offset to samplerate/framerate samples.

    Time scaling factor, if non-unity, changes the frame offset. This is used to increase the number of frames per second by a given factor, so that when a sound is resynthesized, it can be stretched by that same time factor without significant windowing distortion.

    See appendix C for further information.
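The relationship between frame rate and frame offset used by the analysis dialogs can be sketched in a few lines of Python. The function names and the rounding to a whole number of samples are assumptions made for illustration; the text only states that a nonzero frame rate overrides the offset with samplerate/framerate, and that the phase vocoder offset defaults to framesize/8.

```python
def frame_offset(sample_rate: float, frame_rate: float = 0.0, offset: int = 0) -> int:
    """Return the frame offset in samples that an analysis would use."""
    if frame_rate != 0:
        # A nonzero frame rate wins: the entered offset is ignored.
        # Assumption: the result is rounded to a whole number of samples.
        return round(sample_rate / frame_rate)
    # A frame rate of zero means the entered offset is used as-is.
    return offset


def default_pvoc_offset(frame_size: int) -> int:
    """Default phase vocoder frame offset (framesize/8) when no time scaling is used."""
    return frame_size // 8
```

For a 44100 Hz sound, a frame rate of 100 gives an offset of 441 samples regardless of what was typed into the offset field, and a 1024-sample phase vocoder frame defaults to an offset of 128 samples.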

-------------------------------------------------------------------------------
LPC menu (LPC Datafiles only)
-------------------------------------------------------------------------------

stabilize frames
    runs a stabilization algorithm on the entire datafile which tries to
    eliminate all frames with unstable coefficients, i.e., those that would
    produce infinite amplitudes during resynthesis. This is highly
    recommended!

display filter amplitudes
    produces an envelope window displaying the relative amplitudes
    generated by each filter frame as it is interpolated into the following
    frame (10 frames per original). This is useful for detecting "bad"
    frame interpolations.

display filter formants
    produces a more elaborate fast-Fourier transform of each frame, so that
    the formant peaks of each filter frame may be examined.

merge pitch data
    copies the pitch channel information from a Pitch Track Analysis file
    into the fourth channel of the LPC data file. This is necessary because
    the LPC analyzer does not do this on its own. In most circumstances,
    the pitch analysis should be from the same region of the sound as the
    LPC analysis, with the same frame offset or frame rate settings.

adjust pitch deviation...
    the total amount of pitch variance around the average pitch value can
    be adjusted. A threshold is entered first; frames with error thresholds
    exceeding this value will be ignored by the process. These are usually
    unpitched frames. This allows you to effectively smooth or exaggerate
    the inflections within a given pitch curve.

change sample rate...
    allows user to change the value of the sample rate for the data. This
    is only of occasional interest.

change file length...
    change the overall length of the current file

lpc options...
    set the default format for lpc data files, including frame rate, number
    of filter poles, and header format. These defaults will be used in the
    "Save to" panel and the "New LPC Data" panel.
    NOTE: The default sampling rate will be set to the default sound
    sampling rate (see above).

-------------------------------------------------------------------------------
Pitch menu (Pitch Track Datafiles only)
-------------------------------------------------------------------------------

change file length...
    change the overall length of the current file

shift by pitch interval...
    currently not implemented

-------------------------------------------------------------------------------
PVoc menu (Phase Vocoder Datafiles only)
-------------------------------------------------------------------------------

harmonically shift spectrum
    allows user to multiply any portion of the Phase Vocoder analysis by an
    arbitrary factor, effectively transposing that portion of the spectrum.
    Any envelope may be used to map this factor, if desired. A new pvoc
    file will be created by this command, and the original will be
    untouched.

stretch/shrink shift spectrum
    works like the previous command, but the shift factor varies linearly
    with the frequency, converting harmonic spectra into inharmonic ones.
    Any envelope may be used to map this factor, if desired. A new pvoc
    file will be created by this command, and the original will be
    untouched.

change file length...
    change the overall length of the current file. User will be warned if
    reducing the length (which will destroy data).

pvoc options...
    currently not implemented.

-------------------------------------------------------------------------------
Envelope menu (Envelopes only)
-------------------------------------------------------------------------------

create linear curve...
    creates a linear slope over the selected region

create exponential curve...
    creates an exponential curve. The user can select the exponent for the
    curve.

invert existing curve
    currently not implemented.

change file length...
    change the overall length of the current file. User will be warned if
    reducing the length (which will destroy data).

envelope options...
    currently not implemented.

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Appendix Entries
-------------------------------------------------------------------------------
Appendix A: Mxv defaults and X resources
-------------------------------------------------------------------------------

X Resources:

Mxv is built on top of the InterViews toolkit. As with other X toolkits, a
large number of resource values may be set by including lines in your
.Xresources or .Xdefaults file (see X windows manuals for a full explanation
of this procedure). This diagram shows the toolkit hierarchy within mxv. The
entries below are in addition to all those that are available for the
InterViews toolkit. For the most part, these resource values affect the
visual appearance of the program only.

[Note from the editor: I realize this is VERY complicated -- I will
eventually include a section from the InterViews manual here to help explain
this. The default settings of everything should be sufficient for now.
-DAS ]

For example, to set the background color of the horizontal scale in the LPC
data display window to Blue, put a line like the following in your
.Xresources file:

    mxv*LPCWindow*HScale*HorizontalScale*background: Blue

This same resource can be set on the command line using the -xrm option:

    % mxv -xrm "*LPCWindow*HScale*HorizontalScale*background:Blue"

----
Key to diagram:

ProgramName *ResourceNode <*GenericResourceNode> *GenericResourceName: *GenericResourceName: *GenericResourceName: *ResourceNode <*GenericResourceNode> *ResourceNode ("Specific Instance Name") *ResourceName: [default value]

-------------------------------------------------------------------------------
MiXViews *AutoPlaceWindows: [false] *BrowserUseLastPath: [false] *ReadRawFiles: [false] *SoundWindowDisplayChannels: [4] *LPCWindowDisplayChannels: [4] *PitchWindowDisplayChannels: *FFTWindowDisplayChannels: *EnvelopeWindowDisplayChannels: *PvocWindowDisplayChannels: [4] *DefaultSoundFileDir: *DefaultLPCFileDir: *DefaultPitchFileDir: *DefaultFFTFileDir: *DefaultEnvelopeFileDir: *DefaultPvocFileDir: *StatusBar *FramedWindow <*TextWindow> <*DataWindow> *MenuBar *PulldownCommandMenu <*PulldownMenu> [menu titles] *Command <*MenuItem> [command names] *PullrightCommandMenu <*PullrightMenu> *Command <*DataView> *HScale *VMessage ("HorizontalScaleLabel") *HScaleMarks ("HorizontalScale") *padding: *borderWidth: *allowShrink: *HBorder *ViewScaler *VBorder *StatusPanel ("Edit Start: ") *StatusPanel ("Edit End: ") *HorizontalViewScroller *VScale *VMessage ("VerticalScaleLabel") *VScaleMarks ("VerticalScale") *padding: *borderWidth: *allowShrink: <*Graph> *PlotWidth: *PlotHeight: *ChannelView <*DataView> *ChannelGraph <*Graph> *PlotStyle: [bar] *FrameView <*DataView> *VerticalViewScroller *VBorder *FrameGraph <*Graph> <*DialogBox> *Message ("Title") *Message ("Subtitle") *ResponseButton (button title) *Message ("ChoiceButtonTitle") -*RadioButton or -*CheckBox *Alert <*DialogBox> ("AlertPanel") *Confirmer
<*DialogBox> ("ConfirmPanel") *ChoiceDialog <*DialogBox> ("ChoicePanel") *InputDialog <*DialogBox> ("InputPanel") *Message ("TextEntryLabel") *TextInput <*StringEditor> *ValueSlider *Message ("TextEntryLabel") *TextInput <*StringEditor> *NumberLabel *ScrollerBar *SoundWindow <*DataWindow> *LPCWindow <*DataWindow> *PitchWindow <*DataWindow> *FFTWindow <*DataWindow> *EnvelopeWindow <*DataWindow> *PvocWindow <*DataWindow>

--------------
Mxv Defaults:

All non-visual application default settings are in the process of being moved
to a new .mxvrc file which should be located in the user's home directory.
The format of this file is:

    DefaultName1    DefaultValue1
    DefaultName2    DefaultValue2
    ...

Entries must be one pair per line, and may be separated with spaces and/or
tabs. No other characters (i.e., no *'s or :'s) should appear in these lines.
At the present moment, only the following defaults may be set in this file:

    AutoPlaceWindows
    BrowserUseLastPath
    ReadRawFiles
    DefaultSoundFileDir
    DefaultLPCFileDir
    DefaultPitchFileDir
    DefaultFFTFileDir
    DefaultEnvelopeFileDir
    DefaultPvocFileDir

These may also be set in the .Xdefaults file as described above, but values
set in the .mxvrc file will override any settings in the other file.

-------------------------------------------------------------------------------
Appendix B: Quick Overview of Linear Predictive Coding (LPC) Analysis
-------------------------------------------------------------------------------

LPC analysis, which originally was used for statistical analysis, proved
useful in computer music because of its ability to extract and store
time-varying formant information. Time varying means that the information
changes over time, like the amplitude of a waveform does. Formants are
points in a sound's spectrum where frequencies are boosted. In the real world
this is often due to natural resonance in the object that is vibrating.
The difference in the sounds of spoken vowels such as 'a' and 'e' is due to
differences in the formant peaks caused by the difference in the shape of
your mouth when you produce the sounds.

The data generated by an LPC analysis of a sound consists primarily of filter
coefficients which, if used to control a specific type of filter, will alter
an input sound's spectrum to match the formant peaks of the original sound.
If this input sound is a raw pulse waveform (which contains all harmonics at
equal amplitudes), the resultant filtered sound timbre will be very close to
the original. This is the basic procedure for the LPC Resynthesis command.
Typically, the original sound to be analyzed is a vocal sound, which can then
be resynthesized with various parameters (such as pitch or duration) changed.

The Formant Filter command allows for the filtering of any arbitrary input
sound, which "maps" the formant peaks onto that sound. An additional
'warping' parameter is also available, with values between -1 and 1, with 0
being no warping. This factor has the effect of shifting the formant peaks
down or up in frequency, thereby radically altering the timbre. The best
values are in the range +/-0.01 to +/-0.5.

In the LPC analysis command, the number of poles specifies the accuracy of
the analysis: the greater the number of poles, the more precisely the formant
regions will be captured. Typical values range from about 24 for 22 kHz
sounds up to 64 for 44 kHz and 48 kHz sounds.

-------------------------------------------------------------------------------
Appendix C: Quick Overview of Phase Vocoder (PVoc) Analysis
-------------------------------------------------------------------------------

Phase vocoder analysis creates a data file containing frames of information
representing the frequencies and amplitudes of those frequencies for
successive "time slices" of a given sound.
These slices usually overlap, and when a sound is resynthesized from a PVoc
datafile, it is usually possible to produce an exact replica of the original
-- depending on the accuracy of the original analysis. This is very different
from LPC analysis, which only extracts formant peaks. PVoc analyses contain
complete information about the spectral composition -- in essence, a
blueprint -- of a sound. This blueprint may be altered in an infinitude of
ways prior to performing the resynthesis, allowing for an infinite range of
possible resynthesized sounds.

The size and spacing of the "slices" is determined at the time of the
analysis; typically the slices are 512 samples long, and are spaced at 512/8
or 64 samples apart, or about 689 frames per second at a 44.1 kHz sampling
rate.

-------------------------------------------------------------------------------
Appendix D: Dialog Panels
-------------------------------------------------------------------------------

Dialog panels are used to display information and error or alert conditions,
as well as to confirm certain kinds of actions and allow choices to be made.
Another set of dialog panels allows the user to input information for various
types of operations.

In panels which contain text-entry items (places where the user can type
text), the user may switch between items using the <tab> character. If a
single text item is present, a <tab> will redisplay the value, which is very
useful for items with bounded values (see below). The standard kill-word and
kill-line key commands can be used to edit text (i.e., ^W and ^U), as well as
deleting with the backspace or delete key. Hitting <return> will usually
activate one of the buttons at the bottom of the panel -- always the one in
boldface type. In panels offering a choice (i.e., yes-no-cancel or
confirm-cancel) the user may use the keyboard to operate the buttons: typing
'y' for confirm or yes, 'n' or 'c' for no or cancel. Text entries left blank
will reset to the last successfully entered value.
Illegal characters (like letters in a numeric entry item) produce a beep from
the system.

Most of the text-entry items are "bounded", i.e., there is a specific range
of legal values. Some of these display their bounds visually with a slider
and associated end labels showing the min and max values. These are usually
very limited range items, like -1 to +1, or some specific integer range like
1 to 32. Other more general bounds, such as all positive integers, or all
non-negative numbers, do not get displayed -- but if an out-of-bounds value
is entered, the actual parameter value will be adjusted to fit the boundary
conditions. This greatly reduces the number of "invalid parameter" error
messages produced by the program. Hitting a <tab> after entering a value will
display the actual parameter value to be used.

The panels are designed to not interact with the X Window Manager, i.e., you
are not supposed to be able to iconify or hide a dialog -- only enter values
and then confirm or cancel. The "Dialog Panels Ignore Window Manager" option
in the global options panel (see above) allows a choice because some window
managers (specifically the native SGI 4dWM and the Sun olwm) will not allow
keyboard focus if the dialog panels completely ignore the window manager.

-------------------------------------------------------------------------------
Appendix E: Playing and Recording Sounds
-------------------------------------------------------------------------------

Mxv is able to use the digital to analog converter hardware on several
different platforms. On machines which allow only a limited number of sound
formats to be played (such as u-law only or short int only), mxv will
automatically convert the selection you desire to play into an appropriate
format (for example, converting floating point samples into short integers or
u-law).
It will not, however, alter the sampling rate or number of channels, so you
will still get an error message if these additional parameters do not match
the converter specs. To record into a sound object, all parameters must match
-- mxv will not attempt to adjust these.

To stop a sound during play or record, you cannot use the "stop" menu command
except on a NeXT workstation. This is due to the way mxv processes user input
events. Instead, use the keyboard equivalent or , whichever is more
convenient.

If the converter device initialization fails, the converter will switch to an
inactive state until the user either switches converters or uses the "reset
converter" command (see above).

Note that mxv records and plays from virtual memory -- there is no disk i/o
involved in the process. Recording and playing only modify the currently
selected section of the soundfile.

-------------------------------------------------------------------------------
Appendix F: Applying a Fade-out
-------------------------------------------------------------------------------

Right now there is no automatic way to apply fadeouts. Here is the best way:
Go into the File menu and select "new envelope" under New Type. Use the
default values it gives. When the window comes up, select the whole envelope
(right mouse button click) and then go into the Envelope menu and select
"create exponential curve". Have it start at 1 and go to 0. Leave the
exponent at the default. Now, reselect the whole envelope, and then go back
to your sound, select the section over which you want the sound to fade out,
and use the "Apply Envelope" command from the Modify menu. Presto.

-------------------------------------------------------------------------------
Last updated April 19, 1995