JavaScript Linker (JSL) - Alpha 1 : Readme
------------------------------------------------------------------------
Copyright (c) 2004-2006, The Dojo Foundation, All Rights Reserved

Contents

  * 1.0 Overview
  * 2.0 Requirements
  * 3.0 Installation Instructions
  * 4.0 Documentation

1.0 Overview

JavaScript source code can be represented at different levels of granularity. The JavaScript Linker models the source code as Abstract Syntax Trees (ASTs), the lowest-level (most fine-grained) representation. One of the main tasks for this project was to write a JavaCC-compatible grammar that strictly follows the ECMA specification. JavaCC uses this grammar to build a custom parser that can read and analyze JavaScript source, and that parser, in turn, is used to build the JavaScript Linker.

The purpose of the JavaScript Linker is to process an HTML/JavaScript code base and prepare it for deployment: reduce file size, create source code documentation, obfuscate source code to protect intellectual property, and gather source code metrics for analysis and improvement. The source code modifications can either be made in place by overwriting the input files, or be saved to a user-specified output directory.

This latest alpha release of the JavaScript Linker uses the new ECMA grammar (supports ECMA-262 Standard 3rd edition). This release is meant for testing purposes only.

Currently Supported Tasks

This is the list of JavaScript Linker tasks supported in this release:

  1. *Import* - Import JavaScript files from HTML documents
     The import task finds and imports all input source files which have not been explicitly declared by the user, but are referenced with src attributes in HTML.

  2. *Require* - Import source files specified by the require statements included in the Dojo source code
     The require task processes the require statements referenced in the Dojo source code.
     This task, like the ant build scripts included with Dojo, helps in constructing a custom profile which includes only those modules used by your application. The tool automatically processes the require statements from the Dojo source without any intervention from the user.

  3. *Janitor* - Unused function removal via dependency analysis
     The janitor task strips unused functions out of the JavaScript source code. Janitor performs a static code analysis, constructing a function call graph for all global functions. Entry points are also calculated from all source files that have been imported after processing the Dojo require statements. Every function that cannot be reached in the graph from an entry point is considered unused and is removed.

     There are two cases where the analysis needs help from the user: functions that are only called by the server (through the pipe), and functions whose names are composed with string concatenation and then passed to eval or similar reflective functionality. The user can enumerate the function names in these cases in a property that declares them protected from removal.

     The entry points are calculated from all global statements in all JavaScript source code visible to the tool in that run. This might not be desirable, so there is a property that, when set to true, makes the task consider only JavaScript code that was actually imported in an import statement in an HTML file.

     * *Known Issues:* The current version of the code is not very aggressive in removing unused functions. It always errs on the side of caution. However, you can control how aggressive Janitor is by changing the property 'task.janitor.process.global' in the project file. Setting this property helps remove more unused functions, but it also has the potential to break some test cases.
     * HTML event handlers are *ignored* and are *not used* as entry points into the call graph.
       This feature is disabled because the HTML parser used with the earlier version had an incompatible licence. It will be supported once the work on the new HTML grammar is complete.

     The janitor task will be improved incrementally in later releases.

  4. *Muffler* - assert/alert, developer "noise" removal
     Muffler removes developer noise, such as alert and assert statements. For specified identifiers that match declared global functions, the function declarations themselves are removed as well. Examples of statements that are removed:

         assert( foo < 3 );
         alert( "this is a fire drill" );

     It can also remove code that cannot be reached when certain specified identifiers have declared Boolean values. For example, by declaring the identifier debug = false, the code inside the following if-statement cannot be reached, and so is stripped:

         if( debug ) { alert( "here" ); }

  5. *Pretty Printer* - Writing back the results from other tasks
     Print writes out the result from the other tasks and strips out whitespace, newlines and/or comments if desired. If this task is omitted from the task list, the run is a dry run that writes nothing out. The user can look in the log files to check that the run is doing what is expected, and then add this task to the task list to actually write out the results.

     The print result is written to an output directory, in a directory tree structure that is identical to the input directory structure. Since the input can be a list of input directories, the output tree structure starts at the point where the input directories differ (the common prefix is not mirrored). The output can also be written in place by specifying a property. By specifying pretty-printing properties, one can control the stripping of newlines, whitespace and comments.
     * *Known Issues:* This version does not support stripping of newline characters.

Additional Tasks

This is a list of JavaScript Linker tasks that will be supported in future releases:

  1. *Metrics* - Source code metric analysis during a JavaScript Linker run
  2. *Lint* - Checks JavaScript and HTML input for known problems
  3. *Jammer* - Concatenates individual JavaScript files for custom builds/packaging
  4. *Jabber* - Obfuscates JavaScript source code
  5. *Vorpal* - Deobfuscates previously obfuscated source code
  6. *Ogredoc* - Generates HTML documentation from the JavaScript source code

2.0 Requirements

  1. JDK 1.5.x installed, with *JAVA_HOME* pointing to that JDK.
  2. Apache Ant 1.6.x installed, with *ANT_HOME* set.

3.0 Installation Instructions

  1. Download JSL from the SVN repository:

         svn co http://svn.dojotoolkit.org/dojo/trunk/tools/jslinker

  2. Edit the included build.properties and set the location for the *ANT_HOME* property.

  3. Build the project using:

         ant dist

  4. There are 8 test cases bundled with this release, named test1 through test8. To run a test case using ant, navigate to *'jsl/bin'* and type, for example:

         ant test1

  5. To run the tasks from the command line, navigate to *'jsl/bin'* and type:

         java -Xms8m -Xmx200m -cp jsl.jar;sisc.jar;bcel.jar org.dojo.jsl.top.Top --verbose --prj jsl.prj --sources ../tests/test_Colorspace.html

  6. You can also use the included shell script:

         jsc/bin/jsl --verbose --prj jsl.prj --sources ../tests/test_Colorspace.html

  7. After the JavaScript Linker run, by default:

     * Modified files are written to the *'jsl/tmp'* directory.
     * Log files are written to the *'jsl/log'* directory.
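
A note on the command in step 5: it separates the classpath entries with semicolons, which is the Windows convention. On Unix-like systems the Java classpath separator is a colon, and an unquoted semicolon would end the shell command early. The following is only a sketch, assuming the three jar files sit in the current *'jsl/bin'* directory as in step 5. The variable names SEP and CP are illustrative and not part of JSL. It assembles the classpath for either platform and prints the resulting command rather than running it:

```shell
#!/bin/sh
# Pick the classpath separator for the current platform.
# Assumption: Cygwin/MinGW/MSYS shells launch a Windows JVM, which expects ';'.
case "$(uname -s)" in
  CYGWIN*|MINGW*|MSYS*) SEP=';' ;;
  *)                    SEP=':' ;;
esac

# The same three jars as in step 5, joined with the chosen separator.
CP="jsl.jar${SEP}sisc.jar${SEP}bcel.jar"

# Print the command instead of executing it, so the sketch can be tried
# before the project has been built. The classpath is shown quoted so a
# ';' separator is not interpreted by the shell when the line is pasted.
echo java -Xms8m -Xmx200m -cp "\"$CP\"" org.dojo.jsl.top.Top \
     --verbose --prj jsl.prj --sources ../tests/test_Colorspace.html
```

Once the printed line looks right for your platform, it can be pasted into the shell from *'jsl/bin'* to start the actual run.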

4.0 Documentation

JSL Options

The following options are supported:

*JSL command line options:*

  *-s or --sources*
      comma-separated list of directories or files (optional argument, default value is current directory, wildcards |*| and |?| are supported)
  *-e or --exclude*
      comma-separated list of path suffixes of input source files that are ignored (wildcards |*| and |?| are supported)
  *-o or --outputdir*
      output directory (optional argument, default value is current directory)
  *-t or --tempdir*
      temp directory (optional argument, default value is system temp directory)
  *-l or --logdir*
      log directory (optional argument, default value is current directory)
  *-j or --homedir*
      jsl home directory (optional argument, default value is user home directory)
  *-a or --tasks*
      comma-separated list of tasks (required argument if argument *-p* is not present)
  *-p or --prj*
      name of property file (required argument if argument *-a* is not present; if -p is not specified and there is a file called "jsl.prj" in the current directory, that file is used)
  *-P or --prop*
      property key-value pair separated by *=*
  *-v or --verbose*
      verbose mode (i.e. more output during the tool run)
  *-h or --help*
      prints this help message

For normal project deployment it is recommended to create a property *.prj* file that defines all the properties and customizes the tool for the project. This is less awkward than creating long command lines with multiple properties. A template.prj file is provided for convenience. All the available properties are described there as well as in this document. At minimum one should define a list of input directories, an output directory and a list of tasks. If you name your property file "jsl.prj" and start jsl from the directory where that file lives, jsl picks it up automatically without needing a -p option at startup.

Setting Up a Project File

Open the template.prj file and set values for the properties defined there.
The values you set control what input gets processed and what tasks get executed, and tailor the individual tasks to the needs of your project. The documentation for each property specifies the valid property values, and the default value used when none is specified.

Note: Don't use quotes for any specified values.

Specifying Input

The input HTML and JavaScript files should contain valid JavaScript and HTML (i.e. follow the standards). If the tool cannot parse a certain file, that file can be excluded from the run. Here are all the properties that specify input to the tool:

*JSL Input Properties:*

  *jsl.sources*
      The list of input source directories and files. (optional property, default value is the directory where JSL was started)
      *specified with* command line option *-s* or *--sources*
      *value* is a comma-separated list of directories and files
      *Example:* jsl.source.dirs = /projects/framework/content,/projects/music/content/script

  *jsl.sources.encodings*
      Marks sources as having a specific encoding. Each property ends with the name of the encoding.
      *value* is a comma-separated list of filename patterns.
      *Example:* |jsl.rcoder.sources.encodings.Big5=**/src/localized/chinese/*|

  *jsl.sources.unparsable*
      Marks sources as unparsable by jsl. This means that jsl will not attempt to parse them, but will treat them as immutable input when executing tasks (for example, things referenced in those files are still protected from janitor deletion).
      *value* is a comma-separated list of filename patterns.
      *Example:* |jsl.rcoder.sources.unparsable=**/src/heavy_jsp/*|

  *jsl.sources.html.suffixes*
      Specifies which suffixes are used for HTML files. (optional property, default value is html, htm, jhtml, sxi, jsi, adp, jsp)
      *value* is a comma-separated list of suffixes.

  *jsl.sources.js.suffixes*
      Specifies which suffixes are used for JavaScript files. (optional property, default value is |js|)
      *value* is a comma-separated list of suffixes.

  *jsl.sources.exclude*
      The list of input source entities to ignore. (optional property)
      *value* is a comma-separated list of filename patterns
      *Example:* jsl.source.exclude = main.adp,header.adp

Both *jsl.sources* and *jsl.sources.exclude* support the standard file system wildcards |*| and |?|. For the first property, only the last part of a path can have wildcards. For the second property, any part can have wildcards.

For certain tasks it is desirable to consider only HTML files and the files that those HTML files import with src attributes (jammer2 and janitor can operate in this mode). To resolve paths specified in src attributes in HTML to real files on local disks, the tool needs to know the Web root directory, which is specified with this property:

  *jsl.web.root*
      The Web root directory; needed to resolve absolute paths in src attributes in HTML. (optional property, default value is the directory where JSL was started)
      *value* is a directory
      *Example:* jsl.web.root = /projects/framework/content

For cases where the Web structure has URL mappings to several directories, the following property can be used:

  *jsl.web.maps*
      Resolves absolute paths in src attributes in HTML. (optional property)
      *value* is a comma-separated list of key-value pairs; *keys* are URL prefixes, *values* are directories on the local file system
      *Example:* jsl.web.maps = /fw,dig/framework/client/content,/fw/images,dig/images,/music,dig/client/music

*File Patterns*

File patterns as property values are a convenient way to declare a set of files without having to list each file in the set. JSL supports the classic file system patterns |*| and |?| and also supports the |**| pattern. The *ANT* documentation explains patterns best:

  *Patterns*

  Patterns are used for inclusion and exclusion. These patterns look very much like the patterns used in DOS and UNIX:

  |*| matches zero or more characters, |?| matches one character.

  *Examples*:

  |*.java| matches |.java|, |x.java| and |FooBar.java|, but not |FooBar.xml| (does not end with |.java|).

  |?.java| matches |x.java|, |A.java|, but not |.java| or |xyz.java| (both don't have one character before |.java|).

  Combinations of |*|'s and |?|'s are allowed.

  Matching is done per-directory. This means that the first directory in the pattern is matched against the first directory in the path to match, then the second directory is matched, and so on. For example, if the pattern is |/?abc/*/*.java| and the path is |/xabc/foobar/test.java|, the first |?abc| is matched with |xabc|, then |*| is matched with |foobar|, and finally |*.java| is matched with |test.java|. They all match, so the path matches the pattern.

  To make things a bit more flexible, we add one extra feature, which makes it possible to match multiple directory levels. This can be used to match a complete directory tree, or a file anywhere in the directory tree. To do this, |**| must be used as the name of a directory. When |**| is used as the name of a directory in the pattern, it matches zero or more directories. For example: |/test/**| matches all files/directories under |/test/|, such as |/test/x.java|, or |/test/foo/bar/xyz.html|, but not |/xyz.xml|.

The properties that accept values with file patterns are:

  * *jsl.sources* (X)
  * *jsl.sources.exclude*
  * *jsl.sources.fw*
  * *jsl.sources.encodings* (X)

(X) /*Note:* This property can only have |*| and |?| in the last directory part and no |**| pattern./

*File Encodings*

JSL reads an application's source code with the standard ISO-8859-1 encoding. This is usually sufficient for correctly interpreting the data in the files, and for correctly writing the results out after processing. However, international application source code can have files localized for particular languages that need different encodings. JSL supports these encodings by reading the files with the specified encoding, and respecting that encoding when writing them out.
Because it is impossible to guess the encoding of a text file from the stream of data the file contains, any file encoding that differs from the default ISO-8859-1 encoding must be specified to JSL with a set of properties. For example, to specify a group of files having the Chinese encoding |Big5|, the encodings property would be set as follows:

    |jsl.sources.encodings.Big5=**/src/localized/chinese/*|

All the charset encoding names supported by Java J2SE 1.5 are legal encoding names (see Encoding in JDK 1.5).

Tasks

The list of tasks includes: import, require, print, janitor, muffler. More tasks will follow. The print task writes out the results. The import task finds input source files which have not been explicitly declared by the user, but are referenced with src attributes in HTML. The import task is usually the first, and the print task is usually the last. The order in which the tasks are specified is important because the tasks form a pipeline, and the output of one task is the input to the next. The *-a* option (or the *jsl.tasklist* property) should be a comma-separated list of task names.

*Example command line:*

    -a import,muffler,janitor,jammer2,jabber,print

The task list can also be specified in the property file:

  *jsl.tasklist*
      Tasks to be executed by JSL in this run (required property)
      *value* is a comma-separated list of task names
      *Example:* jsl.tasklist = import,muffler,janitor,jammer2,jabber,print

*Short description of each task and its properties.*

*Import*

The import task finds input source files which have not been explicitly declared by the user, but are referenced with src attributes in HTML.

*Require*

The require task processes the require statements referenced in the Dojo source code. This task, like the ant build scripts included with Dojo, helps in constructing a custom profile which includes only those modules used by your application.
The tool automatically processes the require statements from the Dojo source without any intervention from the user.

*Janitor*

Use the janitor task to strip unused functions out of the JavaScript source code. Janitor performs a static code analysis, constructing a function call graph for all global functions. Global statements are considered entry points into the call graph. Every function that cannot be reached in the graph from an entry point is considered unused and is removed.

There are two cases where the analysis needs help from the user: functions that are only called by the server (through the pipe), and functions whose names are composed with string concatenation and then passed to eval or similar reflective functionality. The user can enumerate the function names in these cases in a property that declares them protected from removal.

The entry points are calculated from all global statements in all JavaScript source code visible to the tool in that run. This might not be desirable, so there is a property that, when set to true, makes the task consider only JavaScript code that was actually imported in an import statement in an HTML file.

Here are all the properties for the janitor task that can be specified in a property file:

*Janitor Properties:*

  *task.janitor.entries*
      The global entry points into the code (things that are used, but not called explicitly in code). These are the identifiers that should be protected from removal but are not revealed as such by the call graph analysis. Entries should be simple identifiers, not composite names. For example, to protect |MUMusic.render| the entry |render| should be added to the list. Simple wildcards are supported: |*| and |?|. They have the same meaning as in file system wildcards.
      (optional property)
      *value* is a comma-separated list of identifiers or identifier wildcards
      *Example:* * task.janitor.entries = foo,bar,handle*,*

  *task.janitor.process.js.imports.only*
      The flag that controls whether only JavaScript source files which are actually imported by HTML files are considered when running janitor (optional property, default is false)
      *value* is true or false

  /*Note:* When *task.janitor.process.js.imports.only* is set to true, the tool needs the Web root property (*jsl.web.root*) set to a valid directory./

*Muffler*

The muffler task removes developer noise, such as alert and assert statements. For now, only explicit function calls are removed. For specified identifiers that match declared global functions, the function declarations themselves are removed as well. Examples of statements that are removed:

    assert(foo < 3);
    alert("this is a fire drill");

The identifiers can be composite names like *dig.debug.log* and can have wildcards in them. The wildcard syntax is the familiar syntax from file system wildcards, with |*| and |?|, augmented with an additional wildcard pattern |**| for any number of segments in a composite identifier.

*Example:* |dig.log.**| matches dig.log.foo, dig.log.info, dig.log.info.warning etc.

Muffler also removes code that cannot be reached when certain specified identifiers have declared Boolean values. For example, by declaring the identifier *debug = false*, the code inside the following if-statement cannot be reached, and so is stripped:

    if ( debug ) { alert ( "here" ); }

Here are all the properties for the muffler task that can be specified in a property file:

*Muffler Properties:*

  *task.muffler.noise*
      The function names that need to be deleted (required property for this task)
      *value* is a comma-separated list of identifiers
      *Example:* task.muffler.noise = assert,alert

*Print*

The print task writes out the result, stripping whitespace, newlines and/or comments if desired.
If this task is omitted from the task list, the run is a dry run that writes nothing out. The user can look in the log files to check that the run is doing what is expected, and then add this task to the task list to actually write out the results. The exception is the ogredoc task, which generates output without the help of this task.

The print result is written to an output directory, in a directory tree structure that is identical to the input directory structure. Since the input can be a list of input directories, the output tree structure starts at the point where the input directories differ (the common prefix is not mirrored). The output can also be written in place by specifying a property. By specifying pretty-printing properties, one can control the stripping of newlines, whitespace and comments.

Here are all the properties for the print task that can be specified in a property file:

*JSL Print Properties:*

  *jsl.source.mirror*
      The flag that controls whether excluded files and source files without any defined suffix should be copied over (mirrored) into the output directory (i.e. files that the tool normally doesn't process, such as jpg and gif images) (optional property, default is false)
      *value* is a Boolean

  *task.print.output.dir*
      The output directory (optional property, default value is a date-stamped jslout subdir of the temp dir)
      *specified with* command line option *-o* or *--outputdir*
      *value* is a directory

  *task.print.inplace*
      If true, prints in place (optional property, default value is false)
      *value* is a Boolean

The following properties are the Boolean properties for pretty-printing files of a certain suffix. Their meaning should be obvious from their names.

  *source.js.prettyprinter.strip.all*
  *source.js.prettyprinter.strip.comments*
  *source.js.prettyprinter.strip.whitespace*
  *source.js.prettyprinter.strip.newlines*
  *source.js.prettyprinter.preserve*
  *source.js.prettyprinter.indent*
  *source.html.prettyprinter.strip.comments*

------------------------------------------------------------------------
*Last Update:* 2006/08/25 23:49:07
*Author:* Satish Sekharan