Jekyll2020-03-03T04:08:10+00:00https://zenglix.github.io/feed.xmlLi ZengPowered by Jekyll and minimal-mistakes themeLi Zengli.zeng@yale.edusome BASH topics2019-01-11T00:00:00+00:002019-01-11T00:00:00+00:00https://zenglix.github.io/bash<p>The more I use bash, the more interesting I find it. Every time I encounter a useful bash command or learn something new about one, I write it down for future reference.</p>
<p>Quick links:</p>
<ul>
<li><a href="#regex">Regular Expression</a></li>
<li><a href="#globbing">File Globbing</a></li>
<li><a href="#array">Bash Arrays</a></li>
<li><a href="#find">find</a></li>
<li><a href="#vim">vim</a></li>
<li><a href="#grep">grep</a></li>
<li><a href="#sed">sed</a></li>
<li><a href="#head">head,tail</a></li>
<li><a href="#others">Others</a></li>
<li><a href="#Link">Useful Links</a></li>
</ul>
<p>Online references:</p>
<ul>
<li><a href="http://www.tldp.org/LDP/GNU-Linux-Tools-Summary/html/book1.htm">GNU/Linux Command-Line Tools Summary</a></li>
</ul>
<h2 id="regular-expressions"><a name="regex"></a>Regular Expressions</h2>
<p>Regular expressions (REGEX) are sets of characters and/or <em>metacharacters</em> that <strong>match patterns</strong>; see the <a href="http://tldp.org/LDP/abs/html/x17129.html">REGEX intro</a>.</p>
<p><a href="https://www.youtube.com/watch?v=sa-TUpSx1JA">Video tutorial</a></p>
<h4 id="escapes">Escapes</h4>
<p>Characters that have a special meaning and must be escaped to be matched literally:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>.[<span class="o">{()</span><span class="se">\^</span><span class="nv">$|</span>?<span class="k">*</span>+
</code></pre></div></div>
<h4 id="match-pattern">Match Pattern</h4>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">.</span> - any character except new line
<span class="se">\d</span> - Digit <span class="o">(</span>0-9<span class="o">)</span>
<span class="se">\D</span> - Not a Digit <span class="o">(</span>0-9<span class="o">)</span>
<span class="se">\w</span> - word Character <span class="o">(</span>a-z, A-Z, 0-9, _<span class="o">)</span>
<span class="se">\W</span> - Not a word character
<span class="se">\s</span> - white spaces
<span class="se">\S</span> - not white space
<span class="c"># anchors, don't match any characters</span>
<span class="c"># match invisible positions</span>
<span class="se">\b</span> - Word Boundary
<span class="se">\B</span> - Not word Boundary
^ - beginning of a string
<span class="nv">$ </span>- end of string
<span class="c"># character set</span>
<span class="o">[</span>...] <span class="c"># match any one character in set</span>
- <span class="c"># specify range when used between number/letters</span>
<span class="o">[</span>^] <span class="c"># not in the set </span>
| <span class="c"># either or </span>
<span class="o">(</span> <span class="o">)</span> <span class="c"># Group</span>
<span class="c"># quantifier</span>
<span class="k">*</span> - match 0 or more
+ - match 1 or more
? - match 0 or One
<span class="o">{</span>3<span class="o">}</span> - match exact number
<span class="o">{</span>3,4<span class="o">}</span> - match a range of numbers <span class="o">(</span>Minimum, Maximum<span class="o">)</span>
</code></pre></div></div>
<h4 id="lookaround-lookahead-and-lookbehind">lookaround: lookahead and lookbehind</h4>
<p>Lookaround is an <strong>assertion</strong> (like the line-start or line-end anchor). The engine tries the pattern inside the lookaround at the current position, then discards whatever it matched and <strong>only reports match or no match</strong>. It <strong>does not consume characters</strong> in the string.</p>
<p>Basic syntax:</p>
<ul>
<li>lookahead:
<ul>
<li>positive lookahead: <code class="language-plaintext highlighter-rouge">(?=(regex))</code></li>
<li>negative lookahead: <code class="language-plaintext highlighter-rouge">(?!(regex))</code></li>
</ul>
</li>
<li>lookbehind:
<ul>
<li>positive: <code class="language-plaintext highlighter-rouge">(?<=(regex))</code></li>
<li>negative: <code class="language-plaintext highlighter-rouge">(?<!(regex))</code></li>
</ul>
</li>
</ul>
<p>Example:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// match "book" with "the" after it</span>
<span class="n">book</span><span class="p">(</span><span class="o">?=</span><span class="p">.</span><span class="o">*</span><span class="n">the</span><span class="p">)</span>
<span class="c1">// match "book" with "the" before it (variable-length lookbehind; not every regex engine supports it)</span>
<span class="p">(</span><span class="o">?<=</span><span class="n">the</span><span class="p">.</span><span class="o">*</span><span class="p">)</span><span class="n">book</span>
</code></pre></div></div>
<h4 id="more-examples">More Examples:</h4>
<ul>
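<li>word boundary: <code class="language-plaintext highlighter-rouge">\bHe</code> matches the <code class="language-plaintext highlighter-rouge">He</code> that starts a word, but not the <code class="language-plaintext highlighter-rouge">He</code> inside <code class="language-plaintext highlighter-rouge">eHe</code></li>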
<li>character set: <code class="language-plaintext highlighter-rouge">a[de]c</code>, adc, aec</li>
<li>dash for range: <code class="language-plaintext highlighter-rouge">[a-z0-9A-Z]</code></li>
<li>not in set: <code class="language-plaintext highlighter-rouge">[^1-3]</code></li>
<li>quantifier examples
<ul>
<li><code class="language-plaintext highlighter-rouge">\d{3}</code>: 123</li>
<li><code class="language-plaintext highlighter-rouge">Mr\.?\s[A-Z]\w*</code>: Mr. Zeng, Mr Zeng</li>
</ul>
</li>
<li>Group examples:
<ul>
<li><code class="language-plaintext highlighter-rouge">(Mr|Mrs)\.?\s\w+</code>: Mr. Zeng, Mrs Zeng …</li>
</ul>
</li>
</ul>
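<p>The patterns above can be tried directly in bash with the <code class="language-plaintext highlighter-rouge">[[ string =~ regex ]]</code> test. The following is only a minimal sketch, assuming a bash shell. Bash uses POSIX extended regular expressions, where <code class="language-plaintext highlighter-rouge">\s</code> and <code class="language-plaintext highlighter-rouge">\w</code> are not guaranteed to work, so the group example is rewritten with POSIX character classes. The names <code class="language-plaintext highlighter-rouge">title_re</code> and <code class="language-plaintext highlighter-rouge">matches_title</code> are made up for this illustration.</p>

```bash
# (Mr|Mrs)\.?\s\w+ from the list above, written with POSIX classes
# and anchored so the whole string has to match.
title_re='^(Mr|Mrs)\.?[[:space:]][[:alnum:]_]+$'

# Hypothetical helper: succeeds when its argument matches title_re.
# The regex is kept in a variable and used unquoted on purpose;
# quoting it would make bash treat it as a literal string.
matches_title() {
    [[ $1 =~ $title_re ]]
}

matches_title "Mr. Zeng" && echo "Mr. Zeng matches"
matches_title "Mrs Zeng" && echo "Mrs Zeng matches"
matches_title "Dr. Zeng" || echo "Dr. Zeng does not match"
```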
<h2 id="-file-globbing"><a name="globbing"></a> File Globbing</h2>
<p>File Globbing and REGEX can be confusing. REGEX is used in functions for <strong>matching text in files</strong>, while globbing is used by shells to <strong>match file/directory names</strong> using wildcards.</p>
<p>Wildcards (some in REGEX may also apply):</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">*</code>: match any string</li>
<li><code class="language-plaintext highlighter-rouge">{}</code> expands a comma-separated list (brace expansion), eg: <br />
<code class="language-plaintext highlighter-rouge">ls {a*,b*}</code> lists files starting with either <code class="language-plaintext highlighter-rouge">a</code> or <code class="language-plaintext highlighter-rouge">b</code>.</li>
<li><code class="language-plaintext highlighter-rouge">[]</code>: match any one character in the set, same as in REGEX</li>
</ul>
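<p>To see the wildcards in action without touching real files, here is a small sketch that works in a throwaway directory. It assumes bash and the <code class="language-plaintext highlighter-rouge">mktemp</code> utility; the file names and variable names are invented for the example.</p>

```bash
# Create a scratch directory with a few empty files.
demo_dir=$(mktemp -d)
touch "$demo_dir"/{apple.txt,avocado.md,banana.txt,cherry.txt}

# * matches any string: every file ending in .txt
txt_files=("$demo_dir"/*.txt)

# {a*,b*} expands to two patterns, and each one is then globbed
ab_files=("$demo_dir"/{a*,b*})

# [bc] matches exactly one character from the set
bc_files=("$demo_dir"/[bc]*)

echo "${#txt_files[@]} .txt files, ${#ab_files[@]} a/b files, ${#bc_files[@]} b/c files"

# Clean up the scratch directory.
rm -r "$demo_dir"
```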
<h2 id="-bash-arrays"><a name="array"></a> Bash Arrays</h2>
<ul>
<li>Arrays can be constructed using round brackets: <br />
<code class="language-plaintext highlighter-rouge">var=(item0 item1 item2)</code> or <br />
<code class="language-plaintext highlighter-rouge">var=($(ls -d ./))</code></li>
<li>To access items or change item values, we can use <code class="language-plaintext highlighter-rouge">var[index]</code>. Eg: <br />
<code class="language-plaintext highlighter-rouge">var[index]=new_value</code> <br />
<code class="language-plaintext highlighter-rouge">echo ${var[index]}</code> <br />
Note that when <code class="language-plaintext highlighter-rouge">var</code> is an array, <code class="language-plaintext highlighter-rouge">$var</code> expands only to <code class="language-plaintext highlighter-rouge">${var[0]}</code>. To refer to the whole array, use <code class="language-plaintext highlighter-rouge">${var[@]}</code> or <code class="language-plaintext highlighter-rouge">${var[*]}</code>.</li>
<li>sub-array expansion:
<ul>
<li><code class="language-plaintext highlighter-rouge">${var[*]:s_ind}</code> gives the subarray starting from index <code class="language-plaintext highlighter-rouge">s_ind</code>.</li>
<li><code class="language-plaintext highlighter-rouge">${var[@]:s_ind:l}</code> gives you the length <code class="language-plaintext highlighter-rouge">l</code> sub-array starting at index <code class="language-plaintext highlighter-rouge">s_ind</code>.</li>
<li>Can also replace <code class="language-plaintext highlighter-rouge">@</code> with <code class="language-plaintext highlighter-rouge">*</code>.</li>
</ul>
</li>
</ul>
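<p>The array operations above can be put together in a short sketch. It assumes bash (arrays are not available in plain <code class="language-plaintext highlighter-rouge">sh</code>); all variable names are illustrative.</p>

```bash
var=(item0 item1 item2 item3)

var[1]=new_value            # change a single element
first=$var                  # a bare $var is the same as ${var[0]}
whole=("${var[@]}")         # copy of the whole array
rest=("${var[@]:2}")        # sub-array from index 2 to the end
middle=("${var[@]:1:2}")    # sub-array of length 2 starting at index 1

echo "first:  $first"
echo "length: ${#whole[@]}"
echo "rest:   ${rest[*]}"
echo "middle: ${middle[*]}"
```

<p>Quoting <code class="language-plaintext highlighter-rouge">"${var[@]}"</code> keeps each element as a separate word, which matters once elements contain spaces.</p>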
<h2 id="vim"><a name="vim"></a>vim</h2>
<ul>
<li>In normal mode, all keys act as commands. Examples:
<ul>
<li><code class="language-plaintext highlighter-rouge">p</code>: paste</li>
<li><code class="language-plaintext highlighter-rouge">yy</code>: copy the current line to the clipboard</li>
<li><code class="language-plaintext highlighter-rouge">dd</code>: cut the current line (delete it and keep it in the clipboard)</li>
<li><code class="language-plaintext highlighter-rouge">u (ctrl+R)</code>: undo (redo) changes</li>
<li><code class="language-plaintext highlighter-rouge">hjkl</code>: left, down, up, right</li>
<li><code class="language-plaintext highlighter-rouge">:help <command></code>: get help on a <code class="language-plaintext highlighter-rouge">command</code>; vim opens the corresponding help file</li>
<li><code class="language-plaintext highlighter-rouge">:wq</code> or <code class="language-plaintext highlighter-rouge">:x</code>: save and quit (<code class="language-plaintext highlighter-rouge">w</code> for save, <code class="language-plaintext highlighter-rouge">q</code> for quit)</li>
<li><code class="language-plaintext highlighter-rouge">:q!</code>: quit without saving</li>
</ul>
</li>
<li>Insertion:
<ul>
<li><code class="language-plaintext highlighter-rouge">o</code>: insert a new row after current row</li>
<li><code class="language-plaintext highlighter-rouge">O</code>: insert a new row before current row</li>
<li><code class="language-plaintext highlighter-rouge">a</code>: insert after cursor</li>
<li><code class="language-plaintext highlighter-rouge">i</code>: insert at cursor</li>
</ul>
</li>
<li>Cursor movement:
<ul>
<li><code class="language-plaintext highlighter-rouge">0, :0</code>: beginning of row, page</li>
<li><code class="language-plaintext highlighter-rouge">$, :$</code>: end of row, page</li>
<li><code class="language-plaintext highlighter-rouge">^</code>: to first non-blank character</li>
<li><code class="language-plaintext highlighter-rouge">/pattern</code>: search for pattern (press <code class="language-plaintext highlighter-rouge">n</code> to go to next)</li>
<li><code class="language-plaintext highlighter-rouge">H,M,L</code>: move cursor to top, middle and bottom of page</li>
<li><code class="language-plaintext highlighter-rouge">Ctrl + E,Y</code>: scroll the window down, up by one line</li>
<li><code class="language-plaintext highlighter-rouge">Ctrl + u,d</code>: half page up, down</li>
<li><code class="language-plaintext highlighter-rouge">w,W,e,E,b,B</code>: jump cursor by words</li>
</ul>
</li>
<li><a href="http://vim.wikia.com/wiki/Using_tab_pages"><strong>tabs</strong></a>:
<ul>
<li><code class="language-plaintext highlighter-rouge">:tabedit file</code>, <code class="language-plaintext highlighter-rouge">:tabfind file</code>: open new tab</li>
<li><code class="language-plaintext highlighter-rouge">gt</code>, <code class="language-plaintext highlighter-rouge">gT</code>: next, previous tab</li>
<li><code class="language-plaintext highlighter-rouge">:tabonly</code>: close all other tabs</li>
<li><code class="language-plaintext highlighter-rouge">:tabnew</code>: open empty new tab</li>
<li>can use abbreviations, such as <code class="language-plaintext highlighter-rouge">:tabe</code>, <code class="language-plaintext highlighter-rouge">:tabf</code>, …</li>
<li><strong><code class="language-plaintext highlighter-rouge">:Explore</code></strong>: explore folder with vim</li>
</ul>
</li>
<li><a href="http://vim.wikia.com/wiki/Search_and_replace"><strong>string substitution</strong></a>:
<ul>
<li><code class="language-plaintext highlighter-rouge">%s/pattern/replacement/g</code>: replace all occurrences</li>
<li><code class="language-plaintext highlighter-rouge">s/pattern/replacement/g</code>: replace in current line</li>
<li>flags:
<ul>
<li><code class="language-plaintext highlighter-rouge">g</code> for global</li>
<li><code class="language-plaintext highlighter-rouge">c</code> for confirmation</li>
<li><code class="language-plaintext highlighter-rouge">i</code> for case-insensitive</li>
</ul>
</li>
</ul>
</li>
<li><strong>Visual Mode</strong>
<ul>
<li>type <code class="language-plaintext highlighter-rouge">v</code> to enter visual mode</li>
<li>move cursor to select text</li>
<li><code class="language-plaintext highlighter-rouge">y</code>: copy</li>
</ul>
</li>
<li>Others:
<ul>
<li><code class="language-plaintext highlighter-rouge">:syntax on/off</code>: turn syntax highlighting on/off</li>
<li><code class="language-plaintext highlighter-rouge">:Explore .</code> or <code class="language-plaintext highlighter-rouge">:e .</code>: explore current folder</li>
</ul>
</li>
</ul>
<h2 id="find"><a name="find"></a>find</h2>
<p>General syntax:
<code class="language-plaintext highlighter-rouge">find path -name **** -mtime +1 -newermt 20160621 -size +23M ...</code>
This section introduces each of the above parameters and a few more:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">find ./ -name "*.txt"</code> : searching by name</li>
<li>
<p><code class="language-plaintext highlighter-rouge">find ./ -type d -name "*LZ*"</code>: specify target type, <code class="language-plaintext highlighter-rouge">d</code> for directory, <code class="language-plaintext highlighter-rouge">f</code> for file.</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">find ./ -newerct 20130323</code>:
files whose status changed (<code class="language-plaintext highlighter-rouge">c</code>) after the given time (<code class="language-plaintext highlighter-rouge">t</code>). To compare against another file instead of a date, use <code class="language-plaintext highlighter-rouge">-newer file</code>, which compares modification times.</p>
</li>
<li><code class="language-plaintext highlighter-rouge">find ./ -mtime (-ctime, -atime) +n </code>:
<ul>
<li><code class="language-plaintext highlighter-rouge">m</code> for modified time</li>
<li><code class="language-plaintext highlighter-rouge">c</code> for status change time (permissions, ownership, content), not creation time</li>
<li><code class="language-plaintext highlighter-rouge">a</code> for access time</li>
<li><code class="language-plaintext highlighter-rouge">+n</code> for greater than n days, similarly <code class="language-plaintext highlighter-rouge">-n</code> for within n days. Can also change measures</li>
<li>can also use <code class="language-plaintext highlighter-rouge">amin, cmin, mmin</code> for minutes</li>
</ul>
</li>
<li>
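<p><code class="language-plaintext highlighter-rouge">find ./ -maxdepth 3 -name "PowerGod*"</code>: <br />
set the maximum search depth below this directory; similarly use <code class="language-plaintext highlighter-rouge">-mindepth</code> to set the minimum search depth. GNU find warns if these options come after tests such as <code class="language-plaintext highlighter-rouge">-name</code>, so put them first.</p>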
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">-iname</code> : to ignore case</p>
</li>
<li>running a command on each result:
<code class="language-plaintext highlighter-rouge">-exec cp {} ~/LZfolder/ \;</code>: this copies every file found to <code class="language-plaintext highlighter-rouge">~/LZfolder/</code>
<ul>
<li>each file found is substituted for <code class="language-plaintext highlighter-rouge">{}</code>, and the command is executed once per file</li>
</ul>
</li>
</ul>
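<p>The options above can be tried safely in a scratch directory. This is only a sketch, assuming bash plus a <code class="language-plaintext highlighter-rouge">find</code> that supports <code class="language-plaintext highlighter-rouge">-iname</code>, <code class="language-plaintext highlighter-rouge">-mindepth</code> and <code class="language-plaintext highlighter-rouge">-maxdepth</code> (GNU and BSD find both do); the directory layout and variable names are invented.</p>

```bash
# Scratch tree: one file at the top, two deeper down.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/LZ_notes/deep/deeper" "$demo_dir/backup"
touch "$demo_dir/a.txt" "$demo_dir/LZ_notes/b.TXT" "$demo_dir/LZ_notes/deep/deeper/c.txt"

# -name is case-sensitive, -iname is not.
n_txt=$(find "$demo_dir" -type f -name "*.txt" | wc -l)
n_itxt=$(find "$demo_dir" -type f -iname "*.txt" | wc -l)

# Limit the search to two levels below the starting point.
n_shallow=$(find "$demo_dir" -maxdepth 2 -type f -iname "*.txt" | wc -l)

# Directories only; -mindepth 1 excludes the starting directory itself.
n_dirs=$(find "$demo_dir" -mindepth 1 -type d -name "*LZ*" | wc -l)

# -exec: copy each top-level .txt file into backup/.
find "$demo_dir" -maxdepth 1 -type f -name "*.txt" -exec cp {} "$demo_dir/backup/" \;
copied=$(ls "$demo_dir/backup")

echo "name: $n_txt, iname: $n_itxt, shallow: $n_shallow, dirs: $n_dirs, copied: $copied"
rm -r "$demo_dir"
```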
<h2 id="grep"><a name="grep"></a>grep</h2>
<p><code class="language-plaintext highlighter-rouge">grep</code> searches a file for lines that match a pattern.
General formula: <code class="language-plaintext highlighter-rouge">grep pattern filename</code> <br />
There are many options you can specify:</p>
<ul>
<li>
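<p><code class="language-plaintext highlighter-rouge">grep 'abc$' file</code>: match <code class="language-plaintext highlighter-rouge">abc</code> at the end of a line</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">grep '^F' file</code>: match <code class="language-plaintext highlighter-rouge">F</code> at the beginning of a line</p>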
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">grep -w over file</code>: grep for words.
In this example, words such as <code class="language-plaintext highlighter-rouge">overdue, moreover</code> would be skipped.</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">-A3</code>: also show 3 lines after the lines found</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">-B3</code>: show 3 lines before found lines</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">-C3</code>: show 3 lines before and after</p>
</li>
<li>
<p><strong>logical grep</strong>:</p>
<ul>
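<li>OR grep: <code class="language-plaintext highlighter-rouge">grep -E 'pattern1|pattern2' filename</code> (quote the pattern, otherwise the shell treats <code class="language-plaintext highlighter-rouge">|</code> as a pipe)</li>
<li>AND grep: <code class="language-plaintext highlighter-rouge">grep 'pattern1.*pattern2' filename</code> (matches only when <code class="language-plaintext highlighter-rouge">pattern1</code> comes before <code class="language-plaintext highlighter-rouge">pattern2</code> on the line)</li>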
<li>NOT grep: <code class="language-plaintext highlighter-rouge">grep -v pattern filename</code> <br />
where <code class="language-plaintext highlighter-rouge">-v</code> stands for invert match</li>
</ul>
</li>
</ul>
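<p>A quick way to check these options is to count matching lines with <code class="language-plaintext highlighter-rouge">-c</code> on a small sample file. The sketch below assumes bash and a POSIX-style <code class="language-plaintext highlighter-rouge">grep</code> with the common <code class="language-plaintext highlighter-rouge">-w</code> option; the sample text and variable names are invented.</p>

```bash
# Five sample lines in a temporary file.
sample=$(mktemp)
printf '%s\n' "First line" "game over" "overdue bill" "moreover, fine" "Final abc" > "$sample"

end_abc=$(grep -c 'abc$' "$sample")          # lines ending in abc
start_f=$(grep -c '^F' "$sample")            # lines starting with F
word_over=$(grep -cw 'over' "$sample")       # "over" as a whole word only
either=$(grep -cE 'First|Final' "$sample")   # OR
both=$(grep -c 'more.*fine' "$sample")       # AND, in this order
without=$(grep -cv 'over' "$sample")         # NOT: lines without "over"

echo "$end_abc $start_f $word_over $either $both $without"
rm "$sample"
```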
<h2 id="sed"><a name="sed"></a>sed</h2>
<p><code class="language-plaintext highlighter-rouge">sed</code> is short for <em>Stream EDitor</em>.
General formula:
<code class="language-plaintext highlighter-rouge">sed 's/RegEx/replacement/g' file</code>
which replaces every match of <code class="language-plaintext highlighter-rouge">RegEx</code> with <code class="language-plaintext highlighter-rouge">replacement</code> and prints the result.</p>
<ul>
<li>the separator <code class="language-plaintext highlighter-rouge">/</code> could be replaced by something like <code class="language-plaintext highlighter-rouge">_, |</code>
<ul>
<li>eg: <code class="language-plaintext highlighter-rouge">sed 's|age|year|' file</code> works the same way as <code class="language-plaintext highlighter-rouge">sed 's/age/year/' file</code>.</li>
</ul>
</li>
<li>
<p>simple back referencing, eg:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nv">$echo</span> what | <span class="nb">sed</span> <span class="s1">'s/wha/&&&/'</span> <span class="c"># input</span>
whawhawhat <span class="c"># output</span>
</code></pre></div> </div>
</li>
<li>
<p>more on back referencing, eg:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nb">echo </span>2014-04-01 | <span class="nb">sed</span> <span class="s1">'s/\(....\)-\(..\)-\(..\)/\1+ \2+\3/'</span>
2014+04+01
</code></pre></div> </div>
</li>
</ul>
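<p>Text captured by <code class="language-plaintext highlighter-rouge">\(...\)</code> can be referred to in the replacement as <code class="language-plaintext highlighter-rouge">\1</code>, <code class="language-plaintext highlighter-rouge">\2</code>, and so on, while <code class="language-plaintext highlighter-rouge">&</code> stands for the whole match. A dot <code class="language-plaintext highlighter-rouge">.</code> in a regex matches any single character, which makes it handy for describing fixed-width patterns.</p>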
<ul>
<li>you can also <code class="language-plaintext highlighter-rouge">sed</code> multiple patterns separated by <code class="language-plaintext highlighter-rouge">;</code>, eg:
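<code class="language-plaintext highlighter-rouge">sed 's/pattern1/replace1/;s/pattern2/replace2/g' < file</code> (the quotes keep the shell from treating <code class="language-plaintext highlighter-rouge">;</code> as a command separator)</li>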
</ul>
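<p>The substitutions described in this section can be checked on short strings. The following sketch assumes bash and a POSIX <code class="language-plaintext highlighter-rouge">sed</code>; the input strings and variable names are made up.</p>

```bash
# & stands for the whole match.
tripled=$(echo what | sed 's/wha/&&&/')

# \1, \2, \3 refer to the three \(...\) groups; here they are reordered.
reordered=$(echo 2014-04-01 | sed 's/\(....\)-\(..\)-\(..\)/\3.\2.\1/')

# A different separator avoids escaping the slashes in a path.
moved=$(echo /usr/local/bin | sed 's|/usr/local|/opt|')

# Two commands separated by ; inside one quoted script.
chained=$(echo "age of ages" | sed 's/age/year/;s/of/in/g')

printf '%s\n' "$tripled" "$reordered" "$moved" "$chained"
```

<p>Without the <code class="language-plaintext highlighter-rouge">g</code> flag only the first match on each line is replaced, which is why the second <code class="language-plaintext highlighter-rouge">age</code> in the last example is left alone.</p>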
<h2 id="-head-tail"><a name="head"></a> head, tail</h2>
<h2 id="shell-scripting">Shell scripting</h2>
<h3 id="debugging">Debugging:</h3>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">bash (or sh) -v script.sh</code> : prints each line of the script as it is read, before it is executed</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">bash (or sh) -x script.sh</code> : prints each command after expansion, so you can see the values of variables as the program runs</p>
</li>
</ul>
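<p>The difference between the two flags is easiest to see on a tiny script. This is a sketch only, assuming <code class="language-plaintext highlighter-rouge">bash</code> is on the <code class="language-plaintext highlighter-rouge">PATH</code> and the default <code class="language-plaintext highlighter-rouge">PS4</code> prefix (<code class="language-plaintext highlighter-rouge">+ </code>) is in effect; the script content and variable names are invented.</p>

```bash
# Write a two-line script to a temporary file.
script=$(mktemp)
printf '%s\n' 'name=world' 'echo "hello $name"' > "$script"

# -v echoes the source lines as written; -x echoes the expanded commands.
# Both write their trace to stderr, so it is merged into stdout here.
verbose_out=$(bash -v "$script" 2>&1)
trace_out=$(bash -x "$script" 2>&1)

echo "--- bash -v ---"; echo "$verbose_out"
echo "--- bash -x ---"; echo "$trace_out"
rm "$script"
```

<p>With <code class="language-plaintext highlighter-rouge">-v</code> the line still shows <code class="language-plaintext highlighter-rouge">$name</code>, while with <code class="language-plaintext highlighter-rouge">-x</code> the variable has already been replaced by its value.</p>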
<h2 id="-others"><a name="others"></a> Others</h2>
<h3 id="boolean-value"><strong>Boolean value</strong></h3>
<p>Try <code class="language-plaintext highlighter-rouge">false; echo $?</code>. The output is <code class="language-plaintext highlighter-rouge">1</code>, because in the bash shell a command's exit status is: <br />
<code class="language-plaintext highlighter-rouge">1</code> (or any non-zero value) for <code class="language-plaintext highlighter-rouge">false</code> <br />
<code class="language-plaintext highlighter-rouge">0</code> for <code class="language-plaintext highlighter-rouge">true</code></p>
<h3 id="different-parenthesis-and-brackets"><strong>Different parentheses and brackets</strong></h3>
<p>See <a href="http://stackoverflow.com/questions/2188199/how-to-use-double-or-single-bracket-parentheses-curly-braces">Parenthesis difference</a>.</p>
<ul>
<li><strong>Double parenthesis</strong> (arithmetic operator) :
<ul>
<li><code class="language-plaintext highlighter-rouge">(( expr ))</code> : enables the usage of things like <code class="language-plaintext highlighter-rouge"><, >, <=</code> etc.</li>
<li><code class="language-plaintext highlighter-rouge">echo $(( 5 <= 3 ))</code>, and we get <code class="language-plaintext highlighter-rouge">0</code></li>
<li>arithmetic operator interprets <code class="language-plaintext highlighter-rouge">1</code> as <code class="language-plaintext highlighter-rouge">true</code>, and <code class="language-plaintext highlighter-rouge">0</code> as <code class="language-plaintext highlighter-rouge">false</code>, which is different from the <code class="language-plaintext highlighter-rouge">test</code> command</li>
</ul>
</li>
</ul>
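<p>The two conventions (exit status versus arithmetic value) are easy to mix up, so here is a minimal sketch that records both. It assumes bash; the variable names are illustrative. The <code class="language-plaintext highlighter-rouge">||</code> and <code class="language-plaintext highlighter-rouge">&&</code> forms are used only so that the snippet also behaves under <code class="language-plaintext highlighter-rouge">set -e</code>.</p>

```bash
# Exit status: 0 means success/true, non-zero means failure/false.
false || status_false=$?
true  && status_true=$?

# Arithmetic value: a false comparison evaluates to 0, a true one to 1.
value_false=$(( 5 <= 3 ))
value_true=$(( 5 > 3 ))

# As a command, (( ... )) turns that value back into an exit status.
(( 5 <= 3 )) || arith_status=$?

echo "false -> $status_false, true -> $status_true"
echo "\$(( 5 <= 3 )) -> $value_false, \$(( 5 > 3 )) -> $value_true"
echo "exit status of (( 5 <= 3 )) -> $arith_status"
```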
<h3 id="braces"><strong>Braces</strong></h3>
<ul>
<li>Used for <strong>brace expansion</strong>, which generates lists that are often used in loops, eg:</li>
</ul>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">echo</span> <span class="o">{</span>00..8..2<span class="o">}</span>
00 02 04 06 08
</code></pre></div></div>
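<p>A few more uses of brace lists, as a rough sketch. It assumes bash 4 or later, since zero padding and the step value in <code class="language-plaintext highlighter-rouge">{00..8..2}</code> are not available in older versions such as bash 3.2; the variable names are made up.</p>

```bash
# Zero-padded sequence with a step of 2.
padded=$(echo {00..8..2})

# A comma-separated list expands in place, keeping prefix and suffix.
names=$(echo file_{a,b}.txt)

# Sequences are handy as loop ranges.
total=0
for i in {1..5}; do
    total=$(( total + i ))
done

echo "$padded | $names | $total"
```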
<h3 id="single-and-double-square-brackets"><strong>Single and double square brackets</strong></h3>
<p>Much of below is from <a href="http://stackoverflow.com/questions/3427872/whats-the-difference-between-and-in-bash">bash brackets</a>, and <a href="http://www.ibm.com/developerworks/library/l-bash-test/">bash test functions</a>.</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">[ expression ]</code> is the same as <code class="language-plaintext highlighter-rouge">test expression</code>. eg: <br />
<code class="language-plaintext highlighter-rouge">test -e "$HOME" </code> is the same as <code class="language-plaintext highlighter-rouge">[ -e "$HOME" ]</code> <br />
and both require careful quoting and escaping of special characters.</p>
</li>
<li>
<p>use <code class="language-plaintext highlighter-rouge">-a, -o</code> inside a test, or <code class="language-plaintext highlighter-rouge">||, &&</code> between separate tests, to combine conditions. eg:</p>
</li>
</ul>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># the following are the same</span>
<span class="nv">$ </span><span class="nb">test</span> <span class="nt">-e</span> <span class="s2">"file1"</span> <span class="nt">-a</span> <span class="nt">-d</span> <span class="s2">"file2"</span>
<span class="nv">$ </span><span class="nb">test</span> <span class="nt">-e</span> <span class="s2">"file1"</span> <span class="o">&&</span> <span class="nb">test</span> <span class="nt">-d</span> <span class="s2">"file2"</span>
<span class="nv">$ </span><span class="o">[</span> <span class="nt">-e</span> <span class="s2">"file1"</span> <span class="o">]</span> <span class="o">&&</span> <span class="o">[</span> <span class="nt">-d</span> <span class="s2">"file2"</span> <span class="o">]</span>
<span class="nv">$ </span><span class="o">[</span> <span class="nt">-e</span> <span class="s2">"file1"</span> <span class="nt">-a</span> <span class="nt">-d</span> <span class="s2">"file2"</span> <span class="o">]</span>
</code></pre></div></div>
<p>Note that <code class="language-plaintext highlighter-rouge">[ expr1 ] -a [ expr2 ]</code> and <code class="language-plaintext highlighter-rouge">[ expr1 && expr2 ]</code> both result in errors.</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">[[ expression ]]</code> allows you to use more natural syntax for file and string comparisons. If you want to compare numbers, it’s more common to use double parentheses <code class="language-plaintext highlighter-rouge">(( ))</code>. <br />
eg. <code class="language-plaintext highlighter-rouge">[[ -e "file1" && -e "file2" ]]</code>. <br />
<code class="language-plaintext highlighter-rouge">[[ ]]</code> doesn’t support <code class="language-plaintext highlighter-rouge">-a, -o</code> inside.</li>
</ul>
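<p>As a sketch of the equivalent forms, the three hypothetical helper functions below run the same check (first argument exists, second argument is a directory) with each syntax. It assumes bash, since <code class="language-plaintext highlighter-rouge">[[ ]]</code> is not part of plain <code class="language-plaintext highlighter-rouge">sh</code>.</p>

```bash
# Same condition written three ways; the function names are made up.
check_test()   { test -e "$1" -a -d "$2"; }
check_single() { [ -e "$1" ] && [ -d "$2" ]; }
check_double() { [[ -e $1 && -d $2 ]]; }

# Try them on a scratch file and directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/file1"
mkdir "$demo_dir/dir2"

for check in check_test check_single check_double; do
    "$check" "$demo_dir/file1" "$demo_dir/dir2" && echo "$check: true"
done
rm -r "$demo_dir"
```

<p>Inside <code class="language-plaintext highlighter-rouge">[[ ]]</code> variables do not undergo word splitting, which is why <code class="language-plaintext highlighter-rouge">$1</code> and <code class="language-plaintext highlighter-rouge">$2</code> can be left unquoted there.</p>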
<h3 id="quotes"><strong>Quotes</strong></h3>
<p>Everything inside one pair of quotes is treated as a single word (one argument).</p>
<ul>
<li>Single quotes: preserve everything inside literally</li>
<li>
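<p>Double quotes: preserve most characters, but <code class="language-plaintext highlighter-rouge">$</code>, backticks and <code class="language-plaintext highlighter-rouge">\</code> keep their special meaning, so variables are still expanded.</p>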
<p>See <a href="http://stackoverflow.com/questions/6697753/difference-between-single-and-double-quotes-in-bash">Quotes difference</a> for more.</p>
</li>
</ul>
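<p>The effect of quoting shows up most clearly when counting arguments. A minimal sketch, assuming bash with the default <code class="language-plaintext highlighter-rouge">IFS</code>; <code class="language-plaintext highlighter-rouge">count_args</code> is a hypothetical helper defined only for this example.</p>

```bash
name="Li Zeng"

single='$name'     # single quotes: kept literally, no expansion
double="$name"     # double quotes: $name is expanded

# Hypothetical helper that prints how many arguments it received.
count_args() { echo $#; }

unquoted_n=$(count_args $name)     # split on the space into two words
quoted_n=$(count_args "$name")     # stays one word

echo "single: $single"
echo "double: $double"
echo "unquoted: $unquoted_n argument(s), quoted: $quoted_n argument(s)"
```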
<h3 id="environment-variables"><strong>Environment Variables</strong></h3>
<ul>
<li><code class="language-plaintext highlighter-rouge">$PS1</code>: controls shell prompt</li>
<li><code class="language-plaintext highlighter-rouge">$PATH</code>: when the shell receives a command that is not a builtin, it searches the directories listed in <code class="language-plaintext highlighter-rouge">$PATH</code> for it.</li>
<li><code class="language-plaintext highlighter-rouge">$HOME</code>: home directory</li>
</ul>
<h3 id="easy-command-substitute"><strong>Easy command substitute</strong></h3>
<p>Say my previous command is <code class="language-plaintext highlighter-rouge">vim project.txt</code>. Now I want to <code class="language-plaintext highlighter-rouge">open</code> this file instead of using <code class="language-plaintext highlighter-rouge">vim</code>. Then I can simply input:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="o">!</span>vim:s/vim/open
</code></pre></div></div>
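<p>where <code class="language-plaintext highlighter-rouge">$</code> is the shell prompt. <code class="language-plaintext highlighter-rouge">!vim</code> recalls the most recent command starting with <code class="language-plaintext highlighter-rouge">vim</code>, and <code class="language-plaintext highlighter-rouge">:s/vim/open</code> applies a <code class="language-plaintext highlighter-rouge">sed</code>-style substitution to it before it is run.</p>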
<h3 id="redirection"><strong>Redirection</strong></h3>
<p>The bash shell has 3 basic streams: <code class="language-plaintext highlighter-rouge">input(0)</code>, <code class="language-plaintext highlighter-rouge">output(1)</code>, and <code class="language-plaintext highlighter-rouge">error(2)</code>. We can put the stream number in front of <code class="language-plaintext highlighter-rouge">></code> or <code class="language-plaintext highlighter-rouge"><</code> to redirect a stream somewhere else, eg:</p>
<ul>
<li>Input redirection:
<ul>
<li><code class="language-plaintext highlighter-rouge"><</code> or <code class="language-plaintext highlighter-rouge">0<</code></li>
<li><code class="language-plaintext highlighter-rouge">command << EOF</code>: type the input directly after the command, and end it with a line containing only <code class="language-plaintext highlighter-rouge">EOF</code> (or press <code class="language-plaintext highlighter-rouge">ctrl + D</code>). <code class="language-plaintext highlighter-rouge"><<</code> is the <code class="language-plaintext highlighter-rouge">here document</code> symbol.</li>
<li><code class="language-plaintext highlighter-rouge">command <<< string</code>: the <code class="language-plaintext highlighter-rouge">here string</code> symbol. It feeds a single string to the command as its input.</li>
</ul>
</li>
<li>output redirection:
<ul>
<li><code class="language-plaintext highlighter-rouge">></code> or <code class="language-plaintext highlighter-rouge">1></code>: redirect output</li>
<li><code class="language-plaintext highlighter-rouge">2></code>: redirect error log</li>
<li><code class="language-plaintext highlighter-rouge">2>&1</code> : send stderr to wherever stdout currently goes. <code class="language-plaintext highlighter-rouge">1>&2</code> does the reverse. Here <code class="language-plaintext highlighter-rouge">>&</code> is the syntax for pointing one stream at another.</li>
<li><code class="language-plaintext highlighter-rouge">&> filename</code>: join stdout and stderr in one stream, and put in a file.</li>
</ul>
</li>
</ul>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># input redirection</span>
<span class="nv">$ </span>./myprog < file1.txt
<span class="c"># output and err redirection</span>
<span class="nv">$ </span>./myprog arg1 <span class="o">></span> file.out
<span class="nv">$ </span>./myprog arg1 2> file.err
<span class="nv">$ </span>./myprog arg1 &> out_and_err
</code></pre></div></div>
<ul>
<li>To discard all output: <br />
Use <code class="language-plaintext highlighter-rouge">command > /dev/null 2>&1</code></li>
</ul>
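<p>To make the stream numbers concrete, here is a sketch built around a hypothetical function <code class="language-plaintext highlighter-rouge">emit</code> that writes one line to each output stream. It assumes bash (here strings are not available in plain <code class="language-plaintext highlighter-rouge">sh</code>); all names are invented.</p>

```bash
# Hypothetical command: one line to stdout, one line to stderr.
emit() { echo "to stdout"; echo "to stderr" >&2; }

only_out=$(emit 2>/dev/null)        # drop stderr, keep stdout
only_err=$(emit 2>&1 >/dev/null)    # keep stderr, drop stdout (order matters)
both=$(emit 2>&1)                   # stderr joins stdout

# Here document written to a file, then read back with input redirection.
note=$(mktemp)
cat > "$note" <<EOF
line one
line two
EOF
n_lines=$(wc -l < "$note")
rm "$note"

# Here string: a single string as standard input.
shouted=$(tr 'a-z' 'A-Z' <<< "one row")

printf '%s\n' "$only_out" "$only_err" "$n_lines" "$shouted"
```

<p>Redirections are processed left to right, so <code class="language-plaintext highlighter-rouge">2>&1 >/dev/null</code> first points stderr at the current stdout and only then sends stdout to <code class="language-plaintext highlighter-rouge">/dev/null</code>.</p>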
<h3 id="bash_profile-profile-and-bashrc">~/.bash_profile, ~/.profile and ~/.bashrc</h3>
<p>These are files where you can put commands to be executed when a shell starts.</p>
<p>A bash login shell looks for <code class="language-plaintext highlighter-rouge">~/.bash_profile</code> first, then <code class="language-plaintext highlighter-rouge">~/.bash_login</code>, then <code class="language-plaintext highlighter-rouge">~/.profile</code>, and executes only the first one it finds.</p>
<p>When you start a shell in an existing session (such as screen), you get an interactive, non-login shell. That shell reads its configuration from <code class="language-plaintext highlighter-rouge">~/.bashrc</code> instead.</p>
<p>See discussions: <br />
<a href="http://unix.stackexchange.com/questions/38175/difference-between-login-shell-and-non-login-shell">login, non-login</a> <br />
<a href="http://superuser.com/questions/183870/difference-between-bashrc-and-bash-profile">different startup files</a></p>
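<p>Because login and non-login shells read different files, a common arrangement is to keep most settings in <code class="language-plaintext highlighter-rouge">~/.bashrc</code> and load it from <code class="language-plaintext highlighter-rouge">~/.bash_profile</code>. The sketch below shows one way to do that; <code class="language-plaintext highlighter-rouge">source_if_present</code> is a hypothetical helper name, not a bash builtin.</p>

```bash
# Hypothetical helper for ~/.bash_profile: source a file only if it exists,
# so a missing file does not produce an error at login.
source_if_present() {
    if [ -f "$1" ]; then
        . "$1"
    fi
}

# In ~/.bash_profile you would then add a line such as:
#   source_if_present "$HOME/.bashrc"
```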
<h3 id="command-substitution"><strong>Command substitution</strong></h3>
<p>If we want to use the output of <code class="language-plaintext highlighter-rouge">command 1</code> in a sentence, we can do it in the following two ways:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>... ... <span class="sb">`</span><span class="nb">command </span>1<span class="sb">`</span> ... ... <span class="c"># method 1 </span>
... ... <span class="si">$(</span><span class="nb">command </span>1<span class="si">)</span> ... ... <span class="c"># method 2 </span>
</code></pre></div></div>
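<p>Both forms give the same result; the sketch below also shows that <code class="language-plaintext highlighter-rouge">$( )</code> nests without extra escaping, which is the usual reason to prefer it. It assumes the standard <code class="language-plaintext highlighter-rouge">date</code>, <code class="language-plaintext highlighter-rouge">dirname</code> and <code class="language-plaintext highlighter-rouge">basename</code> utilities; the variable names are illustrative.</p>

```bash
year_backtick=`date +%Y`      # method 1: backticks
year_dollar=$(date +%Y)       # method 2: $( )

# $( ) can be nested directly: take the parent directory, then its name.
parent_name=$(basename "$(dirname /usr/local/bin)")

echo "It is $year_dollar and the parent directory is called $parent_name"
```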
<h3 id="resolve-symbolic-links">Resolve symbolic links</h3>
<p>Say <code class="language-plaintext highlighter-rouge">courses</code> is a symbolic link I created. If I <code class="language-plaintext highlighter-rouge">cd</code> into this link, and then print the working directory:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">cd </span>courses
<span class="nv">$ </span><span class="nb">pwd</span>
/Users/lizeng/paths/courses
</code></pre></div></div>
<p>It’s showing the logical path through the symbolic link, not the physical path of the directory it points to. To get the physical path, we can resolve the link through:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">pwd</span> <span class="nt">-P</span>
/Users/lizeng/Google Drive/Yale/courses
<span class="c"># same also works for many other commands</span>
<span class="nv">$ </span><span class="nb">cd </span>courses<span class="p">;</span> <span class="nb">cd</span> ..<span class="p">;</span> <span class="nb">pwd</span>
/Users/lizeng/paths
<span class="nv">$ </span><span class="nb">cd </span>courses<span class="p">;</span> <span class="nb">cd</span> <span class="nt">-P</span> ..<span class="p">;</span><span class="nb">pwd</span>
/Users/lizeng/Google Drive/Yale
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">-P</code> here stands for <code class="language-plaintext highlighter-rouge">physical</code> (directory).</p>
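<p>The same behaviour can be reproduced in a scratch directory. This is a sketch assuming bash and <code class="language-plaintext highlighter-rouge">ln -s</code>; the directory names mirror the example above but are otherwise invented.</p>

```bash
# Resolve the scratch directory first, in case the temp location is itself a symlink.
demo_dir=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$demo_dir/real/courses"
ln -s "$demo_dir/real/courses" "$demo_dir/courses"

# Each cd runs inside $( ), so the current shell's directory is not changed.
logical=$(cd "$demo_dir/courses" && pwd)               # path through the link
physical=$(cd "$demo_dir/courses" && pwd -P)           # resolved path
up_logical=$(cd "$demo_dir/courses" && cd .. && pwd)   # parent of the link
up_physical=$(cd "$demo_dir/courses" && cd -P .. && pwd)   # parent of the target

printf '%s\n' "$logical" "$physical" "$up_logical" "$up_physical"
rm -r "$demo_dir"
```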
<h2 id="useful-links"><a name="Link">Useful Links</a></h2>
<ul>
<li><a href="https://linode.com/docs/tools-reference/linux-users-and-groups/"><strong>File permission</strong> explained</a></li>
</ul>
Git for version control2019-01-06T00:00:00+00:002019-01-06T00:00:00+00:00https://zenglix.github.io/git<p>Git is a great version control tool. It allows you to keep a history of your code, sync your project online and collaborate with others.</p>
<h3 id="resource">Resources</h3>
<ul>
<li>Online git simulator and tutorial: <a href="http://learngitbranching.js.org/">git branching</a></li>
<li>git tutorial: <a href="https://www.atlassian.com/git/tutorials/">git tutorial</a></li>
<li><a href="http://marklodato.github.io/visual-git-guide/index-en.html">A visual git reference</a></li>
</ul>
<h2 id="basic">BASIC</h2>
<ul>
<li>
<p>configurations <br />
<code class="language-plaintext highlighter-rouge">git config --global user.name "your name"</code> <br />
<code class="language-plaintext highlighter-rouge">git config --global user.email "your.email"</code></p>
</li>
<li>
<p>initiate a git repository: <br />
<code class="language-plaintext highlighter-rouge">cd</code> into the directory where you want to initiate a git repository, and use <code class="language-plaintext highlighter-rouge">git init</code></p>
</li>
<li>
<p>check status of current repository: <code class="language-plaintext highlighter-rouge">git status</code>. <br />
It reports which branch you are on, what files are changed, and which files are not tracked.</p>
</li>
<li>
<p>track a file: <br />
If you want to keep track of the changes in <code class="language-plaintext highlighter-rouge">file_A</code>, use <code class="language-plaintext highlighter-rouge">git add file_A</code>.</p>
</li>
<li>
<p>rename or move a file: <br />
<code class="language-plaintext highlighter-rouge">git mv old_file new_file</code></p>
</li>
<li>
<p>remove a file: <code class="language-plaintext highlighter-rouge">git rm filename</code> <br />
with this command the file is removed both in repository and in file system. <br />
If you just want to remove it from repository, but keep the file in system (simply untrack), use: <br />
<code class="language-plaintext highlighter-rouge">git rm --cached filename</code></p>
</li>
<li>
<p>check the changes in file with last staged version: <code class="language-plaintext highlighter-rouge">git diff filename</code> <br />
changes are marked with <code class="language-plaintext highlighter-rouge">+</code> or <code class="language-plaintext highlighter-rouge">-</code></p>
</li>
<li>
<p>show your commit history: <code class="language-plaintext highlighter-rouge">git log</code></p>
<ul>
<li><code class="language-plaintext highlighter-rouge">git log --graph</code> show commit tree</li>
</ul>
</li>
</ul>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># show latest n commits (here n = 5)</span>
git log <span class="nt">-n</span> 5
<span class="c"># simplified log</span>
git log <span class="nt">--oneline</span>
git log <span class="nt">--oneline</span> <span class="o">[</span><span class="nt">--decorate</span><span class="o">]</span>
</code></pre></div></div>
<ul>
<li>
<p>use <code class="language-plaintext highlighter-rouge">.gitignore</code> to specify files to be ignored <br />
These are usually generated or machine-specific files, including but not limited to:</p>
<ul>
<li>image files</li>
<li>pdfs</li>
<li>compiled code</li>
<li>system files</li>
</ul>
</li>
</ul>
<h2 id="commits">Commits</h2>
<ul>
<li>commit changes:</li>
</ul>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git commit <span class="nt">-a</span> <span class="nt">-m</span> <span class="s2">"commit message"</span> <span class="c"># stage and commit all changes to tracked files (new untracked files are not included)</span>
git add changed_file1 changed_file2 <span class="c"># add files to staging area</span>
git commit <span class="nt">-m</span> <span class="s2">"commit message"</span> <span class="c"># commit changes in staging area</span>
</code></pre></div></div>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">git show</code>: can be used to show various objects <br />
eg: <code class="language-plaintext highlighter-rouge">git show HEAD</code>: show the most recent commit</p>
</li>
<li>
<p>restore a file to an older version: <br />
<code class="language-plaintext highlighter-rouge">git checkout HEAD filename</code> <br />
restores the file to its version in <code class="language-plaintext highlighter-rouge">HEAD</code>. You can replace <code class="language-plaintext highlighter-rouge">HEAD</code> with any other commit. Without <code class="language-plaintext highlighter-rouge">filename</code>, <code class="language-plaintext highlighter-rouge">git checkout commit_id</code> switches the whole working tree to that commit.</p>
</li>
<li>
<p>move current branch to a specified commit (identified by its hash, shown in <code class="language-plaintext highlighter-rouge">git log</code>; a unique prefix such as the first 7 characters is enough): <br />
<code class="language-plaintext highlighter-rouge">git reset commit_id</code></p>
<p>See a discussion of <code class="language-plaintext highlighter-rouge">git reset</code> and <code class="language-plaintext highlighter-rouge">git checkout</code>: <a href="http://stackoverflow.com/questions/3639342/whats-the-difference-between-git-reset-and-git-checkout">reset and checkout</a>.</p>
</li>
</ul>
<h4 id="topics">Topics</h4>
<p><strong>save new commits after <code class="language-plaintext highlighter-rouge">git checkout</code></strong> <br />
Once you use <code class="language-plaintext highlighter-rouge">git checkout commit_id</code>, your <code class="language-plaintext highlighter-rouge">HEAD</code> is detached, and you are not on any branch anymore (or you are on an anonymous branch). You can still stage and commit changes, but they won’t show up on branch master.</p>
<p>To be able to keep those changes, you can do:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># put current changes in a branch: new</span>
git checkout <span class="nt">-b</span> new
<span class="c"># merge new with master</span>
git checkout master
git merge new <span class="c"># may have to manually adjust conflicts</span>
</code></pre></div></div>
<p>see illustration <a href="http://marklodato.github.io/visual-git-guide/index-en.html#detached">save commits after git-checkout</a>.</p>
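<p>To make the detached-HEAD behaviour concrete, here is a toy Python 3 model of branch pointers and <code class="language-plaintext highlighter-rouge">HEAD</code>. It is a sketch, not how git stores data internally: the class <code class="language-plaintext highlighter-rouge">ToyRepo</code>, its methods, and the commit ids <code class="language-plaintext highlighter-rouge">c1</code>, <code class="language-plaintext highlighter-rouge">c3</code>, <code class="language-plaintext highlighter-rouge">c4</code> are all made up for illustration.</p>

```python
class ToyRepo:
    """Toy model of branches and HEAD; not how git stores data internally."""

    def __init__(self, branch, commit):
        self.branches = {branch: commit}  # branch name -> commit id
        self.head = branch                # a branch name, or a commit id when detached
        self.detached = False

    def current_commit(self):
        return self.head if self.detached else self.branches[self.head]

    def checkout(self, target):
        """Like `git checkout target`: attach to a branch or detach at a commit."""
        self.detached = target not in self.branches
        self.head = target

    def checkout_new_branch(self, name):
        """Like `git checkout -b name`: create a branch at the current commit."""
        self.branches[name] = self.current_commit()
        self.head, self.detached = name, False

    def commit(self, commit_id):
        """Record a new commit: move the current branch, or only HEAD if detached."""
        if self.detached:
            self.head = commit_id
        else:
            self.branches[self.head] = commit_id


repo = ToyRepo("master", "c3")
repo.checkout("c1")               # detached HEAD at an old commit
repo.commit("c4")                 # the new commit belongs to no branch
master_after_commit = repo.branches["master"]
repo.checkout_new_branch("new")   # branch `new` now keeps c4 reachable
```

<p>In this model <code class="language-plaintext highlighter-rouge">master</code> never moves while <code class="language-plaintext highlighter-rouge">HEAD</code> is detached, which is why the new commit has to be put on a branch before it can be merged.</p>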
<h2 id="branching">Branching</h2>
<p>Branching in git is good for modular work. Assume you have some intermediate product, and you are going to add another feature to it. It’s good practice to open up a new branch <code class="language-plaintext highlighter-rouge">new_feature</code>, work on that branch until your new program passes all its tests, and then safely merge the <code class="language-plaintext highlighter-rouge">new_feature</code> branch back into your <code class="language-plaintext highlighter-rouge">master</code> branch.</p>
<p>This way if anything horrible happens to your <code class="language-plaintext highlighter-rouge">new_feature</code> code, you can always return to your master branch without worrying that your intermediate product is also messed up.</p>
<ul>
<li>
<p>check which branch I’m currently on: <code class="language-plaintext highlighter-rouge">git branch</code></p>
</li>
<li>
<p>add a new branch: <code class="language-plaintext highlighter-rouge">git branch new_branch_name</code></p>
</li>
<li>
<p>switch branch: <code class="language-plaintext highlighter-rouge">git checkout branch_name</code></p>
</li>
<li>
<p>merge a branch into master (run this while on master): <code class="language-plaintext highlighter-rouge">git merge branch_name</code></p>
</li>
<li>
<p>delete a branch: <code class="language-plaintext highlighter-rouge">git branch -d branch_name</code></p>
</li>
</ul>
<h2 id="remote">Remote</h2>
<p>With <code class="language-plaintext highlighter-rouge">git remote</code> you can freely transport your own git repository or copy others’ git repository.</p>
<ul>
<li>
<p>clone git from <code class="language-plaintext highlighter-rouge">remote_repository</code> and put under folder <code class="language-plaintext highlighter-rouge">clone_name</code>: <br />
<code class="language-plaintext highlighter-rouge">git clone remote_repository clone_name</code> <br />
will automatically give the remote repository a name <code class="language-plaintext highlighter-rouge">origin</code></p>
</li>
<li>
<p>show current remotes: <code class="language-plaintext highlighter-rouge">git remote -v</code></p>
</li>
<li>
<p>add a new remote under name <code class="language-plaintext highlighter-rouge">remote_name</code>: <br />
<code class="language-plaintext highlighter-rouge">git remote add remote_name url</code></p>
</li>
<li>
<p>rename a current remote: <br />
<code class="language-plaintext highlighter-rouge">git remote rename cur_name new_name</code></p>
</li>
<li>
<p>change url of a current remote: <br />
<code class="language-plaintext highlighter-rouge">git remote set-url remote_name new_url</code></p>
</li>
<li>
<p>get the most recent updates from the remote (saved in the <code class="language-plaintext highlighter-rouge">origin/master</code> branch): <code class="language-plaintext highlighter-rouge">git fetch origin</code></p>
<p>If you want to move to the downloaded branch, use <code class="language-plaintext highlighter-rouge">git checkout --track origin/master</code></p>
</li>
<li>
<p>push a branch in your repository to a remote repository: <br />
<code class="language-plaintext highlighter-rouge">git push remote_name branch_name</code> <br />
which pushes branch_name onto remote_name</p>
</li>
</ul>
<h2 id="collaboration">Collaboration</h2>
<p>Git can make life easier when you are collaborating with other people. You may have a shared repository on GitHub, and each of you clones the repository to your own laptop and works on a different branch. Here’s how you contribute to the online repository:</p>
<ul>
<li>
<p>On master branch, <code class="language-plaintext highlighter-rouge">git fetch</code> and <code class="language-plaintext highlighter-rouge">git merge</code> (= <code class="language-plaintext highlighter-rouge">git pull</code>) changes from the remote</p>
</li>
<li>Develop the feature on <code class="language-plaintext highlighter-rouge">your_branch</code> and commit your work.
<ul>
<li>after your code is ready, you can push your branch (or take a git diff file) to the online repo for code review</li>
</ul>
</li>
<li>
<p>Switch back to master, <code class="language-plaintext highlighter-rouge">git pull</code> from the remote again (in case new commits were made while you were working)</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">git merge &lt;your_branch&gt;</code> to add your code to the repo</p>
</li>
<li>Push your branch up to the remote for review</li>
</ul>
<p>At step 3 you can run into a problem: while you were working, the remote repository may have changed (maybe your collaborator has updated it). If those changes conflict with yours, <code class="language-plaintext highlighter-rouge">git pull</code> (or the merge that follows) stops with merge conflicts, which you have to resolve before continuing. <code class="language-plaintext highlighter-rouge">git mergetool</code> becomes very handy for this purpose (<a href="https://stackoverflow.com/questions/161813/how-to-resolve-merge-conflicts-in-git">link</a>).</p>
<h2 id="others">Others</h2>
<ul>
<li><strong>Commit references</strong>
<ul>
<li><strong>Previous Movements</strong>
<ul>
<li><code class="language-plaintext highlighter-rouge">HEAD</code></li>
<li><code class="language-plaintext highlighter-rouge">HEAD@{3}</code>: where your head was 3 <strong>moves</strong> ago</li>
<li>head movements are listed in <code class="language-plaintext highlighter-rouge">git reflog</code></li>
</ul>
</li>
<li><strong>Parents</strong>
<ul>
<li><code class="language-plaintext highlighter-rouge">HEAD^</code> = <code class="language-plaintext highlighter-rouge">HEAD^1</code></li>
<li><code class="language-plaintext highlighter-rouge">HEAD^^</code>: first parent of first parent</li>
<li>can be used with any commit as well, eg <code class="language-plaintext highlighter-rouge">&lt;commit_name&gt;^2</code></li>
<li><code class="language-plaintext highlighter-rouge">***^2</code> only works with merge commits, where there are two parents</li>
</ul>
</li>
<li><strong>Ancestors</strong>
<ul>
<li><code class="language-plaintext highlighter-rouge">HEAD~</code>: first parent</li>
<li><code class="language-plaintext highlighter-rouge">HEAD~~</code>: first parent of first parent</li>
<li><code class="language-plaintext highlighter-rouge">HEAD~5</code>: fifth-generation ancestor, following the first parent each time</li>
</ul>
</li>
<li><strong>Double dot</strong>: range selection
<ul>
<li><code class="language-plaintext highlighter-rouge">master..your_branch</code></li>
<li><code class="language-plaintext highlighter-rouge">git log master..your_branch</code>: all commits reachable from <code class="language-plaintext highlighter-rouge">your_branch</code> but not from <code class="language-plaintext highlighter-rouge">master</code></li>
</ul>
</li>
</ul>
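<p>The reference notations above can be sketched on a toy commit graph in Python 3. Everything here is hypothetical and only for illustration: the commit names <code class="language-plaintext highlighter-rouge">A</code> to <code class="language-plaintext highlighter-rouge">M</code>, the <code class="language-plaintext highlighter-rouge">parents</code> table, and the helper functions. Git resolves these references itself; this just mimics the rules.</p>

```python
# Toy commit graph: each commit maps to its parents (first parent first).
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],       # a commit on master
    "D": ["B"],       # a commit on your_branch
    "M": ["C", "D"],  # merge commit: first parent C, second parent D
}

def nth_parent(commit, n=1):
    """Mimic commit^n: the n-th parent of a commit."""
    return parents[commit][n - 1]

def ancestor(commit, n=1):
    """Mimic commit~n: follow the first parent n times."""
    for _ in range(n):
        commit = parents[commit][0]
    return commit

def reachable(commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

def double_dot(a, b):
    """Mimic a..b: commits reachable from b but not from a."""
    return reachable(b) - reachable(a)
```

<p>For example, <code class="language-plaintext highlighter-rouge">nth_parent("M", 2)</code> picks the second parent of the merge commit, while <code class="language-plaintext highlighter-rouge">ancestor("M", 2)</code> walks two steps back along first parents.</p>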
</li>
<li>
<p><strong><code class="language-plaintext highlighter-rouge">git grep string</code></strong>: <br />
search for <code class="language-plaintext highlighter-rouge">string</code> in git repository. This is extremely useful when you want to change the name of certain functions or variables, as you can locate which files contain the things you wanna change.</p>
</li>
<li><strong><code class="language-plaintext highlighter-rouge">git commit --amend</code></strong>: <br />
modify the most recent commit: change its message and, if you have staged changes, add them to it</li>
</ul>
<h1><a href="https://zenglix.github.io/python">Python quick reference</a></h1>
<p>2018-12-25</p>
<p><strong>Resources</strong></p>
<ul>
<li>for beginners without much programming background: <a href="https://learnpythonthehardway.org/book/">Learn python the hard way</a></li>
<li>a more systematic tutorial: <a href="https://www.tutorialspoint.com/python/index.htm">python: tutorialspoint</a></li>
<li>from R to Python: <a href="file:///Users/lizeng/Dropbox/Research/Z_Research%20notes/programming/python_for_R.pdf">python for R</a></li>
</ul>
<p><strong>Contents</strong></p>
<ol>
<li><a href="#SC">Special Characters</a></li>
<li><a href="#VO">Variables and Operators</a></li>
<li><a href="#FUNCTIONS">Functions</a></li>
<li><a href="#IO">Input/Output</a></li>
<li><a href="#CF">Control Flow</a></li>
<li><a href="#DS"><strong>Data Structures & Classes</strong></a></li>
<li><a href="#LIBRARY"><strong>Modules</strong></a></li>
<li><a href="#data">Python for data analysis</a></li>
<li><a href="#topics">Selected topics</a></li>
<li><a href="#other">Other</a></li>
</ol>
<h2 id="-special-characters"><a name="SC"></a> Special characters</h2>
<ul>
<li><code class="language-plaintext highlighter-rouge">#</code>: add comments after <code class="language-plaintext highlighter-rouge">#</code></li>
<li>common escape sequences: <br />
<code class="language-plaintext highlighter-rouge">\t, \n, \', \" ,\\ ...</code></li>
<li><code class="language-plaintext highlighter-rouge">\</code>: line continuation character</li>
<li><code class="language-plaintext highlighter-rouge">' '</code> and <code class="language-plaintext highlighter-rouge">" "</code> have the same effects</li>
<li>use <code class="language-plaintext highlighter-rouge">;</code> between statements to allow multiple statements in a single line</li>
<li><code class="language-plaintext highlighter-rouge">True, False, None</code></li>
</ul>
<h2 id="variables-and-operators"><a name="VO"></a>Variables and operators</h2>
<ul>
<li>
<p>basics:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="o">+</span><span class="p">,</span><span class="o">-</span><span class="p">,</span><span class="o">*</span><span class="p">,</span><span class="o">/</span> <span class="o">...</span> <span class="c1"># straightforward (in Python 2, / on two ints floors the result)
</span>
<span class="o">%</span> <span class="c1"># modulo operator
</span> <span class="o">//</span> <span class="c1"># floor division: take floor() after division
</span> <span class="n">a</span><span class="o">**</span><span class="n">b</span> <span class="c1"># exponentiation: a to the power b
</span>
<span class="c1"># bitwise operators work as in C; here A and B are integers used as bit sets
</span> <span class="n">A</span> <span class="o">|</span> <span class="n">B</span> <span class="c1"># set union
</span> <span class="n">A</span> <span class="o">&</span> <span class="n">B</span> <span class="c1"># set intersection
</span> <span class="n">A</span> <span class="o">&</span> <span class="o">~</span><span class="n">B</span> <span class="c1"># set subtraction
</span> <span class="n">ALL_BITS</span> <span class="o">^</span> <span class="n">A</span> <span class="ow">or</span> <span class="o">~</span><span class="n">A</span> <span class="c1"># set negation
</span> <span class="n">A</span> <span class="o">|=</span> <span class="mi">1</span> <span class="o"><<</span> <span class="n">bit</span> <span class="c1"># set bit
</span> <span class="n">A</span> <span class="o">&=</span> <span class="o">~</span><span class="p">(</span><span class="mi">1</span> <span class="o"><<</span> <span class="n">bit</span><span class="p">)</span> <span class="c1"># clear bit
</span> <span class="p">(</span><span class="n">A</span> <span class="o">&</span> <span class="mi">1</span> <span class="o"><<</span> <span class="n">bit</span><span class="p">)</span> <span class="o">!=</span> <span class="mi">0</span> <span class="c1"># test bit
</span> <span class="n">A</span><span class="o">&-</span><span class="n">A</span> <span class="ow">or</span> <span class="n">A</span><span class="o">&~</span><span class="p">(</span><span class="n">A</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="ow">or</span> <span class="n">A</span><span class="o">^</span><span class="p">(</span><span class="n">A</span><span class="o">&</span><span class="p">(</span><span class="n">A</span><span class="o">-</span><span class="mi">1</span><span class="p">))</span> <span class="c1"># extract lowest set bit (three equivalent forms)
</span> <span class="n">A</span><span class="o">&</span><span class="p">(</span><span class="n">A</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="c1"># remove lowest set bit
</span> <span class="o">~</span><span class="mi">0</span> <span class="c1"># get all 1-bits
</span></code></pre></div> </div>
<p>for a review of bitwise operations: <a href="https://discuss.leetcode.com/topic/50315/a-summary-how-to-use-bit-manipulation-to-solve-problems-easily-and-efficiently">bitwise operations review</a></p>
</li>
<li>Logical variables: <code class="language-plaintext highlighter-rouge">True, False</code></li>
<li><strong>Logical operators</strong>: <br />
<code class="language-plaintext highlighter-rouge">not A, A or B, A and B</code></li>
<li>check type of a variable: <br />
<code class="language-plaintext highlighter-rouge">print type(variable)</code> or <code class="language-plaintext highlighter-rouge">print type(variable).__name__</code></li>
<li>multiple assignment of variables: <br />
<code class="language-plaintext highlighter-rouge">a,b,c = 1,2,'power'</code></li>
<li><strong>standard data types</strong>:
<ul>
<li>numbers:
<code class="language-plaintext highlighter-rouge">var1 = 1; var2 = 2.0 # var1 int, var2 float</code> <br />
<code class="language-plaintext highlighter-rouge">del var1, var2 # remove variables</code></li>
<li>string</li>
<li>list</li>
<li>tuple</li>
<li>dictionary</li>
</ul>
</li>
<li>
<p>the <code class="language-plaintext highlighter-rouge">dir()</code> function:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">print</span> <span class="nb">dir</span><span class="p">()</span> <span class="c1"># all variables in the current space
</span> <span class="k">print</span> <span class="nb">dir</span><span class="p">(</span><span class="n">module_name</span><span class="p">)</span> <span class="c1"># all variable names in a module
</span></code></pre></div> </div>
</li>
<li>
<p>the unpack operator <code class="language-plaintext highlighter-rouge">*</code>:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">x</span><span class="o">=</span> <span class="p">[</span><span class="s">'A'</span><span class="p">,</span><span class="s">'B'</span><span class="p">,</span><span class="s">'C'</span><span class="p">]</span>
<span class="c1"># pass A,B,C as separate arguments
</span> <span class="n">fun</span><span class="p">(</span><span class="o">*</span><span class="n">x</span><span class="p">)</span>
</code></pre></div> </div>
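<p>A small runnable version of the unpacking idea, written for Python 3. The function <code class="language-plaintext highlighter-rouge">describe</code> and the dictionary <code class="language-plaintext highlighter-rouge">opts</code> are hypothetical; the <code class="language-plaintext highlighter-rouge">**</code> form for dictionaries is added here as the keyword-argument counterpart of <code class="language-plaintext highlighter-rouge">*</code>.</p>

```python
def describe(first, second, third):
    """Join three arguments so we can see which value landed where."""
    return "{}-{}-{}".format(first, second, third)

x = ['A', 'B', 'C']
joined = describe(*x)         # same as describe('A', 'B', 'C')

opts = {'first': 1, 'second': 2, 'third': 3}
from_dict = describe(**opts)  # same as describe(first=1, second=2, third=3)
```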
</li>
<li>
<p>global/local variables:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">global</span> <span class="n">var</span> <span class="c1"># inside a function, declare that `var` refers to the global variable, so assignments change it
</span> <span class="nb">globals</span><span class="p">()</span> <span class="c1"># gives a dict of global variables
</span> <span class="nb">locals</span><span class="p">()</span> <span class="c1"># gives a dict of local variables
</span></code></pre></div> </div>
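<p>A short sketch of what <code class="language-plaintext highlighter-rouge">global</code> changes, assuming Python 3 and run as a top-level script. The names <code class="language-plaintext highlighter-rouge">counter</code>, <code class="language-plaintext highlighter-rouge">increment</code> and <code class="language-plaintext highlighter-rouge">shadow</code> are made up for the example.</p>

```python
counter = 0  # module-level (global) variable

def increment():
    global counter   # assignments below now target the global name
    counter += 1

def shadow():
    counter = 100    # without `global`, this creates a new local variable
    return locals()  # dict of the local names only

increment()
increment()
local_names = shadow()  # the global counter is untouched by shadow()
```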
</li>
</ul>
<h2 id="-functions"><a name="FUNCTIONS"></a> Functions</h2>
<ul>
<li>
<p>define a function:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="c1"># a simple function example
</span> <span class="k">def</span> <span class="nf">my_fun</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="n">y</span><span class="p">):</span>
<span class="k">print</span> <span class="s">"sum of x and y is "</span><span class="p">,</span> <span class="n">x</span><span class="o">+</span><span class="n">y</span>
<span class="k">return</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="n">y</span><span class="p">,</span><span class="n">x</span><span class="o">+</span><span class="n">y</span><span class="p">)</span> <span class="c1"># or {'x':x,'z':x+y}
</span>
<span class="c1"># use the function
</span> <span class="n">my_fun</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">)</span>
<span class="n">my_fun</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span><span class="n">y</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
</code></pre></div> </div>
<p>Note that you can return multiple values separated by <code class="language-plaintext highlighter-rouge">,</code>; they are packed into a tuple. Let <code class="language-plaintext highlighter-rouge">out = my_fun(1,3)</code>. We can access values by <code class="language-plaintext highlighter-rouge">out[0]</code> (or <code class="language-plaintext highlighter-rouge">out['x']</code> if the function returns the dictionary from the comment instead)</p>
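<p>The tuple return can be checked directly. A minimal sketch in Python 3 (so the <code class="language-plaintext highlighter-rouge">print</code> statement is dropped); the variable names are only for illustration.</p>

```python
def my_fun(x, y):
    # several comma-separated values are returned as one tuple
    return x, y, x + y

out = my_fun(1, 3)          # keep the whole tuple
a, b, total = my_fun(1, 3)  # or unpack it into separate names
```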
</li>
<li>
<p>write a function with unknown number of arguments: <strong><code class="language-plaintext highlighter-rouge">*args</code> and <code class="language-plaintext highlighter-rouge">**kwargs</code></strong></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="c1"># args collects all positional (unnamed) arguments into a tuple
</span> <span class="c1"># kwargs collects all keyword (named) arguments into a dictionary
</span> <span class="k">def</span> <span class="nf">myfun</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span><span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="n">args</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">kwargs</span><span class="p">)</span>
<span class="c1"># examples
</span> <span class="n">myfun</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="s">'x'</span><span class="p">)</span>
<span class="c1">#(1, 2, 'x')
</span> <span class="c1">#{}
</span> <span class="n">myfun</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span><span class="n">y</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span><span class="n">z</span><span class="o">=</span><span class="s">'x'</span><span class="p">)</span>
<span class="c1">#()
</span> <span class="c1">#{'x': 1, 'y': 2, 'z': 'x'}
</span></code></pre></div> </div>
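<p>Positional and keyword arguments can also be mixed in one call, and the same syntax forwards them to another function. A sketch in Python 3 with hypothetical helpers <code class="language-plaintext highlighter-rouge">collect</code> and <code class="language-plaintext highlighter-rouge">forward</code>:</p>

```python
def collect(*args, **kwargs):
    """Return what was received so it can be inspected."""
    return args, kwargs

def forward(*args, **kwargs):
    # pass every argument through unchanged
    return collect(*args, **kwargs)

mixed = collect(1, 2, z='x')
forwarded = forward('a', key='value')
```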
</li>
<li>
<p><strong>Pass by reference vs value</strong> <br />
<strong>All Python arguments are passed by object reference</strong>, i.e. when a parameter is used in a function, a new reference to the same object is created and used in the statements in the function body. Mutating the object is visible to the caller; rebinding the parameter name is not.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># example function
</span><span class="k">def</span> <span class="nf">change</span><span class="p">(</span><span class="n">mylist</span><span class="p">):</span>
    <span class="n">mylist</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="s">'2'</span><span class="p">)</span> <span class="c1"># changes the caller's object
</span>    <span class="n">mylist</span> <span class="o">+=</span> <span class="p">[</span><span class="s">'a'</span><span class="p">,</span><span class="s">'b'</span><span class="p">]</span> <span class="c1"># changes the caller's object in place
</span>    <span class="n">mylist</span> <span class="o">=</span> <span class="n">mylist</span> <span class="o">+</span> <span class="p">[</span><span class="s">'c'</span><span class="p">,</span><span class="s">'d'</span><span class="p">]</span> <span class="c1"># does not change the caller's object
</span>    <span class="c1"># because this line creates a new object and rebinds the local name mylist to it, while the old object remains
</span>    <span class="k">return</span>
</code></pre></div> </div>
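<p>To see the effect from the caller's side, here is a variant of <code class="language-plaintext highlighter-rouge">change</code> in Python 3 that returns the rebound list. Returning the list is my addition, only so both objects can be compared.</p>

```python
def change(mylist):
    mylist.append('2')             # mutates the caller's list
    mylist += ['a', 'b']           # in-place extend: still the caller's list
    mylist = mylist + ['c', 'd']   # builds a new list and rebinds the local name
    return mylist

original = ['1']
returned = change(original)
```

<p>After the call, <code class="language-plaintext highlighter-rouge">original</code> has the first two changes but not the third, and <code class="language-plaintext highlighter-rouge">returned</code> is a different object.</p>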
</li>
<li>
<p>variable length arguments in functions</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># example function
</span><span class="k">def</span> <span class="nf">eg</span><span class="p">(</span><span class="n">arg1</span><span class="p">,</span> <span class="o">*</span><span class="n">vartuple</span><span class="p">):</span>
    <span class="k">print</span> <span class="n">arg1</span>
    <span class="k">for</span> <span class="n">var</span> <span class="ow">in</span> <span class="n">vartuple</span><span class="p">:</span>
        <span class="k">print</span> <span class="n">var</span>
    <span class="k">return</span>
<span class="c1"># the parameter prefixed with * collects the remaining positional arguments in a tuple
</span></code></pre></div> </div>
</li>
<li>
<p><strong>lambda functions</strong> (small anonymous functions)</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="c1"># lambda function example
</span> <span class="n">g</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span><span class="o">**</span><span class="mi">2</span>
<span class="n">f</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">x</span><span class="p">,</span><span class="n">y</span><span class="p">:</span> <span class="n">x</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="n">y</span><span class="o">**</span><span class="mi">2</span>
<span class="k">print</span> <span class="n">g</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="k">print</span> <span class="n">f</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">)</span>
</code></pre></div> </div>
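<p>Lambdas are most useful when passed straight to another function. A minimal sketch in Python 3, with made-up sample data, using the built-ins <code class="language-plaintext highlighter-rouge">sorted</code> and <code class="language-plaintext highlighter-rouge">map</code>:</p>

```python
pairs = [(1, 'b'), (3, 'a'), (2, 'c')]

# sort by the second element of each pair
by_letter = sorted(pairs, key=lambda p: p[1])

# apply a small function to every item
squares = list(map(lambda x: x ** 2, [1, 2, 3]))
```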
</li>
</ul>
<h2 id="-inputoutput"><a name="IO"></a> Input/Output</h2>
<h4 id="command-line-inputoutput">command line input/output</h4>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">print</code> examples:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="c1"># basics
</span> <span class="k">print</span> <span class="s">"hello world"</span>
<span class="k">print</span> <span class="s">"hello"</span><span class="p">,</span> <span class="s">"world"</span>
<span class="k">print</span> <span class="s">"Power"</span> <span class="o">+</span> <span class="s">"God"</span> <span class="c1"># concatenating, no space in between
</span> <span class="n">word1</span> <span class="o">=</span> <span class="s">"hello"</span><span class="p">;</span> <span class="n">word2</span> <span class="o">=</span> <span class="s">"world"</span>
<span class="k">print</span> <span class="s">"say: </span><span class="si">%</span><span class="s">s </span><span class="si">%</span><span class="s">s"</span> <span class="o">%</span> <span class="p">(</span><span class="n">word1</span><span class="p">,</span><span class="n">word2</span><span class="p">)</span>
<span class="k">print</span> <span class="s">"say: </span><span class="si">%</span><span class="s">s"</span> <span class="o">%</span> <span class="n">word1</span>
<span class="k">print</span> <span class="n">word1</span><span class="p">,</span> <span class="n">word2</span>
<span class="c1"># more examples
</span> <span class="k">print</span> <span class="s">"God "</span> <span class="o">*</span><span class="mi">10</span> <span class="c1"># repeatedly printing
</span> <span class="k">print</span> <span class="s">"""
type in as long as you want
you can use multiple lines
"""</span> <span class="c1"># the use of triple quotes
</span></code></pre></div> </div>
<ul>
<li>use <code class="language-plaintext highlighter-rouge">%s</code> for strings, <code class="language-plaintext highlighter-rouge">%d</code> for integers, <code class="language-plaintext highlighter-rouge">%f</code> for floating point. Use <code class="language-plaintext highlighter-rouge">%r</code> when debugging for raw version of variables</li>
<li>if you add a <code class="language-plaintext highlighter-rouge">,</code> at the end of a <code class="language-plaintext highlighter-rouge">print</code> statement, it doesn’t end the line.</li>
</ul>
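<p>The format codes can be tried without printing anything by building the strings first. This sketch assumes Python 3, where <code class="language-plaintext highlighter-rouge">print</code> is a function and the trailing-comma trick becomes the <code class="language-plaintext highlighter-rouge">end</code> argument; the <code class="language-plaintext highlighter-rouge">io.StringIO</code> buffer is only there to capture the output.</p>

```python
import io

word1, word2 = "hello", "world"
s1 = "say: %s %s" % (word1, word2)       # %s for strings
s2 = "%d items cost %.2f" % (3, 4.5)     # %d for integers, %f for floats
s3 = "%r" % word1                        # %r shows the raw (repr) form
repeated = "God " * 3                    # string repetition

# end="" keeps the next print on the same line
buf = io.StringIO()
print("a", end="", file=buf)
print("b", file=buf)
captured = buf.getvalue()
```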
</li>
<li>
<p>taking inputs interactively:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">prompt</span> <span class="o">=</span> <span class="s">'>'</span> <span class="c1"># optional
</span> <span class="k">print</span> <span class="s">"tell me your age: "</span>
<span class="n">age</span> <span class="o">=</span> <span class="nb">raw_input</span><span class="p">(</span><span class="n">prompt</span><span class="p">)</span>
<span class="c1">#eg
</span> <span class="n">age</span> <span class="o">=</span> <span class="nb">raw_input</span><span class="p">(</span><span class="s">"How old are you?"</span><span class="p">)</span>
<span class="c1">#or
</span> <span class="kn">import</span> <span class="nn">sys</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">sys</span><span class="o">.</span><span class="n">stdin</span><span class="o">.</span><span class="n">read</span><span class="p">()</span> <span class="c1"># read from input
</span></code></pre></div> </div>
<ul>
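<li>in Python 3, <code class="language-plaintext highlighter-rouge">print</code> is a function, so you need to write <code class="language-plaintext highlighter-rouge">print( )</code>, and <code class="language-plaintext highlighter-rouge">raw_input()</code> is renamed <code class="language-plaintext highlighter-rouge">input()</code></li>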
</ul>
</li>
<li>
<p>passing command line input:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kn">from</span> <span class="nn">sys</span> <span class="kn">import</span> <span class="n">argv</span>
<span class="c1"># first argument is always the script
</span> <span class="n">script</span><span class="p">,</span> <span class="n">first</span><span class="p">,</span> <span class="n">second</span> <span class="o">=</span> <span class="n">argv</span>
<span class="c1"># first argument in arg1, the remaining in arg (starred unpacking, Python 3 only)
</span> <span class="n">script</span><span class="p">,</span> <span class="n">arg1</span><span class="p">,</span> <span class="o">*</span><span class="n">arg</span> <span class="o">=</span> <span class="n">argv</span>
</code></pre></div> </div>
</li>
</ul>
<h4 id="file-inputoutput">file input/output</h4>
<ul>
<li>
<p>file objects: <br />
use <code class="language-plaintext highlighter-rouge">txt = open(filename)</code> where the <code class="language-plaintext highlighter-rouge">open</code> function returns a file object. There are also three modes, <code class="language-plaintext highlighter-rouge">'r'</code>(read), <code class="language-plaintext highlighter-rouge">'w'</code>(write), and <code class="language-plaintext highlighter-rouge">'a'</code>(append), for opening the file. With the file object, you can do:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">txt</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">txt</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="s">"stuff"</span><span class="p">)</span>
<span class="n">txt</span><span class="o">.</span><span class="n">readline</span><span class="p">()</span> <span class="c1"># read only one line
</span> <span class="n">txt</span><span class="o">.</span><span class="n">truncate</span><span class="p">()</span> <span class="c1"># truncate at the current position (empties the file if at the start)
</span> <span class="n">txt</span><span class="o">.</span><span class="n">close</span><span class="p">()</span> <span class="c1"># close the file
</span>
<span class="c1"># others
</span> <span class="n">txt</span><span class="o">.</span><span class="n">name</span> <span class="c1"># name of the file
</span> <span class="n">txt</span><span class="o">.</span><span class="n">closed</span> <span class="c1"># bool: whether the file is closed
</span> <span class="n">txt</span><span class="o">.</span><span class="n">mode</span> <span class="c1"># open mode
</span>
</code></pre></div> </div>
</li>
</ul>
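<p>A minimal sketch of the three open modes, assuming Python 3. The scratch path <code class="language-plaintext highlighter-rouge">demo.txt</code> inside a temporary directory is made up for illustration, and the <code class="language-plaintext highlighter-rouge">with</code> statement is used here so that <code class="language-plaintext highlighter-rouge">close()</code> does not have to be called by hand:</p>

```python
import os
import tempfile

# Hypothetical scratch file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# 'w' creates (or overwrites) the file; `with` closes it automatically.
with open(path, "w") as txt:
    txt.write("first line\nsecond line\n")

# 'a' appends to the end instead of overwriting.
with open(path, "a") as txt:
    txt.write("third line\n")

# 'r' (the default) reads; readline() returns one line, read() the rest.
with open(path, "r") as txt:
    first = txt.readline()
    rest = txt.read()

print(repr(first))
print(repr(rest))
print(txt.name == path, txt.mode, txt.closed)
```

<p>After the last block, <code class="language-plaintext highlighter-rouge">txt.closed</code> is true even though <code class="language-plaintext highlighter-rouge">close()</code> was never called explicitly.</p>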
<h2 id="-control-flow"><a name="CF"></a> Control Flow</h2>
<p>Indentation matters in the following code.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># if statement:
</span><span class="k">if</span> <span class="n">cond1</span><span class="p">:</span>
<span class="n">statement1</span>
<span class="k">elif</span> <span class="n">cond2</span><span class="p">:</span>
<span class="n">statement2</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">statement3</span>
<span class="c1"># conditional expression
</span><span class="n">x</span> <span class="o">=</span> <span class="n">val1</span> <span class="k">if</span> <span class="n">condition</span> <span class="k">else</span> <span class="n">val2</span> <span class="c1"># same as
</span><span class="k">if</span> <span class="n">condition</span><span class="p">:</span> <span class="n">x</span> <span class="o">=</span> <span class="n">val1</span>
<span class="k">else</span><span class="p">:</span> <span class="n">x</span> <span class="o">=</span> <span class="n">val2</span>
<span class="c1"># for loop:
</span><span class="k">for</span> <span class="n">item</span> <span class="ow">in</span> <span class="n">iterable</span><span class="p">:</span>
<span class="n">actions_in_each_iteration</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">after_condition_violation</span>
<span class="c1"># while loop:
</span><span class="k">while</span> <span class="p">(</span><span class="n">condition</span><span class="p">):</span>
<span class="n">actions_in_each_iteration</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">after_condition_violation</span>
<span class="c1"># loop control statements
</span><span class="k">break</span>
<span class="k">continue</span>
<span class="k">pass</span> <span class="c1"># null operation
</span></code></pre></div></div>
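<p>Note that an <code class="language-plaintext highlighter-rouge">else:</code> clause can be added after each looping structure; it runs only when the loop finishes without hitting <code class="language-plaintext highlighter-rouge">break</code>.</p>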
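<p>The loop <code class="language-plaintext highlighter-rouge">else:</code> clause is easiest to see next to a <code class="language-plaintext highlighter-rouge">break</code>. The sketch below is only an illustration; the function name <code class="language-plaintext highlighter-rouge">find_first_negative</code> is made up for this example:</p>

```python
def find_first_negative(numbers):
    """Return a message describing the first negative number, if any."""
    for n in numbers:
        if n < 0:
            result = "found %d" % n
            break  # leaving via break skips the else block
    else:
        # runs only when the loop finished without hitting break
        result = "no negative number"
    return result


print(find_first_negative([3, 1, -4, 1]))
print(find_first_negative([3, 1, 4]))
```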
<h2 id="-data-structures--classes"><a name="DS"></a> Data Structures & Classes</h2>
<h4 id="numbers">Numbers</h4>
<p><a href="https://www.tutorialspoint.com/python/python_numbers.htm">Tutorialspoint: Numbers overview</a></p>
<ul>
<li>type conversion functions: <br />
<code class="language-plaintext highlighter-rouge">int(x); long(x); float(x); complex(x)</code> (<code class="language-plaintext highlighter-rouge">long</code> exists only in Python 2)</li>
</ul>
<h4 id="strings">Strings</h4>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># initialization
</span><span class="n">mystring</span> <span class="o">=</span> <span class="s">"a test sentence"</span>
<span class="c1"># member functions
</span><span class="n">mystring</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s">' '</span><span class="p">)</span> <span class="c1"># split by ' ' into a list of strings
</span><span class="n">mystring</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span> <span class="c1"># remove leading and trailing whitespace (including newlines)
</span><span class="n">mystring</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">seq</span><span class="p">)</span> <span class="c1"># concatenate values in seq using mystring as separator
</span><span class="n">mystring</span><span class="o">.</span><span class="n">endswith</span><span class="p">(</span><span class="nb">str</span><span class="p">)</span> <span class="c1"># test the end of mystring
</span><span class="n">mystring</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="n">str1</span><span class="p">,</span> <span class="n">str2</span><span class="p">)</span> <span class="c1"># replace str1 with str2, return new string
</span><span class="n">mystring</span><span class="o">.</span><span class="n">isupper</span><span class="p">();</span> <span class="n">mystring</span><span class="o">.</span><span class="n">islower</span><span class="p">()</span> <span class="c1"># test upper/lower case
</span><span class="n">mystring</span><span class="o">.</span><span class="n">index</span><span class="p">(</span><span class="s">'a'</span><span class="p">)</span>
<span class="n">mystring</span><span class="o">.</span><span class="n">count</span><span class="p">(</span><span class="s">'c'</span><span class="p">)</span> <span class="c1"># get index and counts of a substring
</span><span class="n">mystring</span><span class="o">.</span><span class="n">find</span><span class="p">(</span><span class="s">"str"</span><span class="p">)</span> <span class="c1"># find substring; if not found return -1
</span>
<span class="c1"># others
</span><span class="nb">str</span><span class="p">[</span><span class="mi">1</span><span class="p">:</span><span class="mi">3</span><span class="p">]</span>
<span class="nb">str</span><span class="p">[</span><span class="mi">2</span><span class="p">:]</span> <span class="c1"># subsetting
</span><span class="nb">str</span><span class="o">*</span><span class="mi">2</span> <span class="c1"># string repetition
</span><span class="nb">str</span> <span class="o">+</span> <span class="s">"another string"</span> <span class="c1"># string concatenation
</span><span class="nb">str</span><span class="p">()</span> <span class="c1"># convert to string
</span><span class="nb">list</span><span class="p">(</span><span class="n">s</span><span class="p">)</span> <span class="c1"># turn string into list of characters
</span>
<span class="nb">str</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">=</span><span class="s">'a'</span> <span class="c1">#DOES NOT WORK! strings do not support modification
</span><span class="nb">ord</span><span class="p">(</span><span class="s">'x'</span><span class="p">)</span> <span class="c1"># letter to ASCII code
</span><span class="nb">chr</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span> <span class="c1"># code to letter
</span></code></pre></div></div>
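<p>A short runnable sketch of the string methods above, assuming Python 3. It uses its own variable names rather than <code class="language-plaintext highlighter-rouge">str</code>, since assigning to <code class="language-plaintext highlighter-rouge">str</code> would shadow the built-in type:</p>

```python
mystring = "  a test sentence\n"

cleaned = mystring.strip()   # "a test sentence"
words = cleaned.split(" ")   # ['a', 'test', 'sentence']
joined = "-".join(words)     # "a-test-sentence"

# Strings are immutable: build a new string instead of assigning to an index.
try:
    cleaned[0] = "A"
except TypeError as err:
    print("cannot modify in place:", err)
capitalized = "A" + cleaned[1:]

print(words)
print(joined)
print(capitalized)
print(cleaned.find("zzz"))   # find() returns -1 when the substring is absent
print(ord("x"), chr(120))    # character to code point and back
```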
<h4 id="list">List</h4>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># initialization with []
</span><span class="n">mylist</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="s">'a'</span><span class="p">,(</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">),[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">]]</span>
<span class="n">mylist</span> <span class="o">=</span> <span class="p">[</span><span class="s">'a'</span><span class="p">]</span> <span class="o">*</span> <span class="mi">10</span> <span class="c1"># initialize with length
</span>
<span class="c1"># member functions
</span><span class="n">mylist</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">obj</span><span class="p">)</span> <span class="c1"># append obj to the end of list
</span><span class="n">mylist</span><span class="o">.</span><span class="n">sort</span><span class="p">()</span> <span class="c1"># sort the list (in place)
</span><span class="n">mylist</span><span class="o">.</span><span class="n">pop</span><span class="p">()</span> <span class="c1"># pop the last item
</span><span class="n">mylist</span><span class="o">.</span><span class="n">pop</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="c1"># pop the ith
</span><span class="n">mylist</span><span class="o">.</span><span class="n">reverse</span><span class="p">()</span>
<span class="n">mylist</span><span class="o">.</span><span class="n">insert</span><span class="p">(</span><span class="n">index</span><span class="p">,</span> <span class="n">obj</span><span class="p">)</span> <span class="c1"># insertion
</span><span class="n">mylist</span><span class="o">.</span><span class="n">remove</span><span class="p">(</span><span class="n">val</span><span class="p">)</span> <span class="c1"># delete the first occurrence of val
</span><span class="n">mylist</span><span class="o">.</span><span class="n">count</span><span class="p">(</span><span class="n">item</span><span class="p">)</span> <span class="c1"># count the frequency of item
</span><span class="n">mylist</span><span class="o">.</span><span class="n">index</span><span class="p">(</span><span class="n">val</span><span class="p">)</span> <span class="c1"># get index of a value
# ... etc
</span>
<span class="c1"># others
</span><span class="n">mylist</span><span class="p">[</span><span class="mi">2</span><span class="p">:]</span> <span class="c1"># subsetting
</span><span class="n">mylist</span><span class="p">[</span><span class="o">-</span><span class="mi">3</span><span class="p">:]</span>
<span class="n">mylist</span><span class="p">[:</span><span class="mi">2</span><span class="p">]</span>
<span class="n">mylist</span><span class="p">[</span><span class="mi">1</span><span class="p">:</span><span class="mi">2</span><span class="p">]</span>
<span class="n">mylist</span><span class="p">[::</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="c1"># reversing
</span><span class="n">mylist</span><span class="o">*</span><span class="mi">2</span> <span class="c1"># repetition
</span><span class="n">mylist</span> <span class="o">+</span> <span class="n">list2</span> <span class="c1"># list concatenation
</span><span class="k">del</span> <span class="n">mylist</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="c1"># item deletion
</span><span class="mi">3</span> <span class="ow">in</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">]</span> <span class="c1"># membership test
</span><span class="nb">list</span><span class="p">(</span><span class="n">var</span><span class="p">)</span> <span class="c1"># convert to list
</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">b</span> <span class="o">=</span> <span class="n">c</span> <span class="o">=</span> <span class="p">[]</span>
<span class="c1"># caution! a, b, c all refer to the same empty list
</span></code></pre></div></div>
<p>Note:</p>
<ul>
<li><strong>list slicing results in a (shallow) copy of the list</strong></li>
</ul>
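<p>Both points, slicing as a copy and chained assignment as aliasing, can be checked directly. A minimal sketch, with the variable name <code class="language-plaintext highlighter-rouge">sliced</code> chosen only for this example:</p>

```python
mylist = [1, 2, [3, 4]]
sliced = mylist[:]          # slicing builds a new outer list
sliced.append(5)
print(mylist)               # append on the copy leaves the original alone

# The copy is shallow: nested objects are still shared.
sliced[2].append(99)
print(mylist[2])

# Chained assignment binds all three names to ONE list object.
a = b = c = []
a.append("x")
print(b, c, a is b)

# Separate lists need separate constructions.
a, b, c = [], [], []
a.append("x")
print(b, c, a is b)
```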
<h4 id="tuples">Tuples</h4>
<p>The main difference between lists and tuples: lists are mutable (items can be added, removed, or reassigned), while tuples cannot be changed after creation. <strong>Tuples can be thought of as read-only lists.</strong></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># initialization by round parenthesis
</span><span class="n">mytuple</span> <span class="o">=</span> <span class="p">(</span><span class="s">'abc'</span><span class="p">,</span><span class="s">'def'</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">)</span>
<span class="c1"># member functions
</span><span class="n">mytuple</span><span class="o">.</span><span class="n">index</span><span class="p">(</span><span class="s">'abc'</span><span class="p">)</span> <span class="c1"># get index
</span><span class="n">mytuple</span><span class="o">.</span><span class="n">count</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="c1"># get frequency of element
</span>
<span class="c1"># others (similar to lists)
</span></code></pre></div></div>
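<p>To make "read-only" concrete, here is a small sketch (Python 3 assumed). The <code class="language-plaintext highlighter-rouge">grid</code> dictionary is a made-up example of one place where immutability helps:</p>

```python
mytuple = ('abc', 'def', 1, 2, 3)

# Reading works like a list.
print(mytuple[0], mytuple[-2:], len(mytuple))

# Writing does not: item assignment raises TypeError.
try:
    mytuple[0] = 'xyz'
except TypeError as err:
    print("tuples are read-only:", err)

# "Updating" means building a new tuple.
longer = mytuple + (4,)
print(longer)

# Being immutable, tuples can serve as dictionary keys; lists cannot.
grid = {(0, 0): "origin"}
print(grid[(0, 0)])
```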
<h4 id="dictionary">Dictionary</h4>
<p>A dictionary is a hash-table-like data structure in Python.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># initialization with curly braces
</span><span class="n">mydict</span> <span class="o">=</span> <span class="p">{}</span>
<span class="n">mydict</span> <span class="o">=</span> <span class="p">{</span><span class="s">'one'</span><span class="p">:</span> <span class="s">"power"</span><span class="p">,</span> <span class="s">'two'</span><span class="p">:</span> <span class="s">"god"</span><span class="p">}</span>
<span class="n">mydict</span><span class="p">[</span><span class="s">'one'</span><span class="p">]</span> <span class="o">=</span> <span class="s">"item 1"</span>
<span class="n">mydict</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="s">"item 2"</span> <span class="c1"># keys can be almost anything that is immutable
</span>
<span class="c1"># member functions
</span><span class="n">mydict</span><span class="o">.</span><span class="n">keys</span><span class="p">()</span> <span class="c1"># get keys
</span><span class="n">mydict</span><span class="o">.</span><span class="n">values</span><span class="p">()</span> <span class="c1"># get values
</span><span class="n">mydict</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">default</span><span class="p">)</span> <span class="c1"># get value for key; if key is missing, return default (None when omitted)
</span><span class="n">mydict</span><span class="o">.</span><span class="n">clear</span><span class="p">()</span> <span class="c1"># empty the mydict
</span><span class="n">mydict</span><span class="o">.</span><span class="n">items</span><span class="p">()</span> <span class="c1"># returns (key, value) pairs: a list in Python 2, a view in Python 3
</span><span class="n">mydict</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="n">mydict2</span><span class="p">)</span> <span class="c1"># add mydict2 elements to mydict; for repeated keys, values from mydict2 overwrite those in mydict
</span><span class="n">mydict</span><span class="o">.</span><span class="n">pop</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="c1"># remove an item
</span><span class="n">mydict</span><span class="o">.</span><span class="n">setdefault</span><span class="p">(</span><span class="n">key</span><span class="p">,</span><span class="n">default_value</span><span class="p">)</span> <span class="c1"># return mydict[key]; if key is missing, insert it with default_value first
</span></code></pre></div></div>
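<p>The behaviour of <code class="language-plaintext highlighter-rouge">get</code>, <code class="language-plaintext highlighter-rouge">update</code>, and <code class="language-plaintext highlighter-rouge">setdefault</code> is easy to mix up, so here is a small sketch that can be run as-is (Python 3 assumed; the keys and values are only sample data):</p>

```python
mydict = {'one': "power", 'two': "god"}

# get() never raises for a missing key; the default is passed positionally.
print(mydict.get('three'))
print(mydict.get('three', "n/a"))

# update() overwrites existing keys with the values from its argument.
mydict.update({'two': "item 2", 'three': "item 3"})
print(mydict['two'])

# setdefault() returns the existing value, or inserts the default and returns it.
print(mydict.setdefault('one', "unused"))
print(mydict.setdefault('four', "item 4"))

# pop() removes a key and returns its value.
removed = mydict.pop('three')
print(removed, sorted(mydict.keys()))

# In Python 3, items() is a view; sorted() or list() turns it into a list of tuples.
print(sorted(mydict.items()))
```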
<h4 id="sets">Sets</h4>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># set initialization
</span><span class="n">s</span> <span class="o">=</span> <span class="nb">set</span><span class="p">([</span><span class="o">...</span><span class="p">])</span>
<span class="n">s</span> <span class="o">=</span> <span class="p">{</span><span class="mi">0</span><span class="p">}</span>
<span class="c1"># operations
</span><span class="n">s</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="s">'element'</span><span class="p">)</span> <span class="c1"># add elements
</span><span class="n">s</span><span class="o">.</span><span class="n">remove</span><span class="p">(</span><span class="s">"element"</span><span class="p">)</span>
<span class="n">s</span><span class="o">.</span><span class="n">union</span><span class="p">(</span><span class="n">s2</span><span class="p">)</span>
<span class="n">s</span><span class="o">.</span><span class="n">intersection</span><span class="p">(</span><span class="n">s2</span><span class="p">)</span>
<span class="c1"># set operators
</span><span class="o">|</span><span class="p">,</span> <span class="o">|=</span>
<span class="o">&</span><span class="p">,</span> <span class="o">&=</span>
<span class="o">-</span><span class="p">,</span> <span class="o">-=</span>
<span class="o">^</span><span class="p">,</span> <span class="o">^=</span>
<span class="o">>=</span><span class="p">,</span> <span class="o"><=</span>
</code></pre></div></div>
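<p>The operators listed above are shown here applied to two small sets. This is just an illustrative sketch with made-up values:</p>

```python
s = {1, 2, 3}
s2 = set([3, 4])

union = s | s2
common = s & s2
only_in_s = s - s2
in_exactly_one = s ^ s2
print(union, common, only_in_s, in_exactly_one)

# <= and >= test for subset and superset.
print({1, 2} <= s, s >= {1, 2})

# The augmented forms (|=, &=, -=, ^=) modify the left-hand set in place.
s |= s2
print(s)

# {} is an empty dict, not an empty set; use set() for that.
print(type({}).__name__, type(set()).__name__)
```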
<h4 id="classes">CLASSES</h4>
<p>Python is an object-oriented language, thus it’s very easy to define classes and objects.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## create a class
</span><span class="k">class</span> <span class="nc">myclass</span><span class="p">:</span>
<span class="s">"this is the first class I create"</span>
<span class="n">name</span> <span class="o">=</span> <span class="s">""</span>
<span class="n">height</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">weight</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">bmi</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">__secret</span> <span class="o">=</span> <span class="s">"hidden variable"</span>
<span class="c1"># class initiation method
</span> <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="n">name</span><span class="p">,</span> <span class="n">height</span><span class="p">,</span> <span class="n">weight</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">name</span> <span class="o">=</span> <span class="n">name</span>
<span class="bp">self</span><span class="o">.</span><span class="n">height</span> <span class="o">=</span> <span class="n">height</span>
<span class="bp">self</span><span class="o">.</span><span class="n">weight</span> <span class="o">=</span> <span class="n">weight</span>
<span class="bp">self</span><span class="o">.</span><span class="n">bmi</span> <span class="o">=</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">weight</span><span class="p">)</span><span class="o">/</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">height</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span>
<span class="c1"># printing method
</span> <span class="k">def</span> <span class="nf">printmyclass</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Name:"</span><span class="p">,</span><span class="bp">self</span><span class="o">.</span><span class="n">name</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Height:"</span><span class="p">,</span><span class="bp">self</span><span class="o">.</span><span class="n">height</span><span class="p">,</span><span class="s">"</span><span class="se">\n</span><span class="s">Weight:"</span><span class="p">,</span><span class="bp">self</span><span class="o">.</span><span class="n">weight</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Your bmi is calculated as: </span><span class="si">%.3</span><span class="s">f"</span> <span class="o">%</span> <span class="bp">self</span><span class="o">.</span><span class="n">bmi</span><span class="p">)</span>
<span class="c1"># destroy this object:
</span> <span class="c1"># Python deletes built-in types or class instances automatically to free the memory space
</span> <span class="c1"># just to pop out a message
</span> <span class="k">def</span> <span class="nf">__del__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">__class__</span><span class="o">.</span><span class="n">__name__</span><span class="p">,</span> <span class="s">"destroyed"</span><span class="p">)</span>
<span class="c1">## use a class
</span><span class="n">me</span> <span class="o">=</span> <span class="n">myclass</span><span class="p">(</span><span class="s">"PowerGod"</span><span class="p">,</span><span class="mf">1.72</span><span class="p">,</span><span class="mi">68</span><span class="p">)</span>
<span class="n">me</span><span class="o">.</span><span class="n">printmyclass</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="n">me</span><span class="o">.</span><span class="n">bmi</span><span class="p">)</span>
<span class="c1">## methods on a class object
</span><span class="k">print</span> <span class="p">(</span><span class="nb">hasattr</span><span class="p">(</span><span class="n">me</span><span class="p">,</span><span class="s">"height"</span><span class="p">))</span> <span class="c1"># test existence of an attribute
</span><span class="k">print</span> <span class="p">(</span><span class="nb">getattr</span><span class="p">(</span><span class="n">me</span><span class="p">,</span><span class="s">"height"</span><span class="p">))</span> <span class="c1"># get an attribute
</span><span class="nb">setattr</span><span class="p">(</span><span class="n">me</span><span class="p">,</span><span class="s">"height"</span><span class="p">,</span><span class="mf">1.78</span><span class="p">)</span> <span class="c1"># ...
</span><span class="nb">delattr</span><span class="p">(</span><span class="n">me</span><span class="p">,</span><span class="s">"weight"</span><span class="p">)</span>
<span class="n">me</span><span class="o">.</span><span class="n">printmyclass</span><span class="p">()</span>
<span class="c1">## selected built-in attributes
</span><span class="n">me</span><span class="o">.</span><span class="n">__doc__</span> <span class="c1"># documentation string
</span><span class="n">me</span><span class="o">.</span><span class="n">__dict__</span> <span class="c1"># dictionary of the instance's namespace (myclass.__dict__ for the class's)
</span>
<span class="c1"># private variables
# name the variables you want to hide with double underscore prefix
</span><span class="k">print</span> <span class="p">(</span><span class="n">me</span><span class="o">.</span><span class="n">__secret</span><span class="p">)</span> <span class="c1"># raises AttributeError: the name is mangled to _myclass__secret
</span></code></pre></div></div>
<p><strong>Class inheritance</strong>: assume that <code class="language-plaintext highlighter-rouge">myclass</code> is defined, then we can define a <code class="language-plaintext highlighter-rouge">child</code> subclass that inherits all its attributes:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">child</span><span class="p">(</span><span class="n">myclass</span><span class="p">):</span>
<span class="s">"""
this is a subclass of myclass
all myclass's attributes can be inherited
"""</span>
<span class="n">major</span><span class="o">=</span><span class="s">""</span>
<span class="n">year</span><span class="o">=</span><span class="mi">1993</span>
<span class="k">def</span> <span class="nf">printchild</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">name</span><span class="p">,</span><span class="s">" is going to graduate in "</span><span class="p">,</span><span class="bp">self</span><span class="o">.</span><span class="n">year</span><span class="p">)</span>
<span class="n">me2</span> <span class="o">=</span> <span class="n">child</span><span class="p">(</span><span class="s">"PowerGod"</span><span class="p">,</span><span class="mf">1.00</span><span class="p">,</span><span class="mf">45.00</span><span class="p">)</span> <span class="c1"># using parent class method
</span><span class="n">me2</span><span class="o">.</span><span class="n">printchild</span><span class="p">()</span> <span class="c1"># using subclass method
</span><span class="n">me2</span><span class="o">.</span><span class="n">printmyclass</span><span class="p">()</span> <span class="c1"># using parent class method
</span>
<span class="c1"># can also use super() to access parent class methods
# eg
</span> <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="n">__init__</span><span class="p">(</span><span class="n">a</span><span class="p">,</span><span class="n">b</span><span class="p">,</span><span class="n">c</span><span class="p">)</span>
</code></pre></div></div>
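<p>The <code class="language-plaintext highlighter-rouge">super()</code> fragment above can be fleshed out as follows. This is a sketch under assumptions, not the classes from this post: <code class="language-plaintext highlighter-rouge">Person</code> and <code class="language-plaintext highlighter-rouge">Student</code> are hypothetical stand-ins, and the argument-free <code class="language-plaintext highlighter-rouge">super()</code> form requires Python 3. It also shows what happens to a double-underscore attribute:</p>

```python
class Person:
    def __init__(self, name, height, weight):
        self.name = name
        self.height = height
        self.weight = weight
        self.__secret = "hidden"  # stored under the mangled name _Person__secret


class Student(Person):
    def __init__(self, name, height, weight, year):
        super().__init__(name, height, weight)  # reuse the parent's setup
        self.year = year                        # then add the subclass's own field

    def describe(self):
        return "%s is going to graduate in %d" % (self.name, self.year)


s = Student("PowerGod", 1.72, 68, 1993)
print(s.describe())
print(hasattr(s, "__secret"))   # the unmangled name does not exist
print(s._Person__secret)        # the mangled name is still reachable
```

<p>Name mangling only discourages outside access; it does not make the attribute truly private.</p>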
<p><strong>Notes:</strong></p>
<ul>
<li>
<p>Python also allows multiple inheritance</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">child</span><span class="p">(</span><span class="n">myclass1</span><span class="p">,</span> <span class="n">myclass2</span><span class="p">):</span>
<span class="k">pass</span>
</code></pre></div> </div>
<p><code class="language-plaintext highlighter-rouge">child</code> can use member methods from both parents. If a method name is shared, the method from <code class="language-plaintext highlighter-rouge">myclass1</code> is used, because parents are searched left to right (the method resolution order).</p>
</li>
</ul>
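<p>A quick way to check which parent wins is to inspect <code class="language-plaintext highlighter-rouge">__mro__</code>. In the sketch below, the method names <code class="language-plaintext highlighter-rouge">hello</code> and <code class="language-plaintext highlighter-rouge">bye</code> are invented for illustration:</p>

```python
class myclass1:
    def hello(self):
        return "hello from myclass1"


class myclass2:
    def hello(self):
        return "hello from myclass2"

    def bye(self):
        return "bye from myclass2"


class child(myclass1, myclass2):
    pass


c = child()
print(c.hello())   # shared name: the first listed parent wins
print(c.bye())     # only myclass2 defines it, so it is still found
print([cls.__name__ for cls in child.__mro__])
```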
<h2 id="python-environment-management">Python environment management</h2>
<p>Assume you have <em>Anaconda</em> installed.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># create a new environment
</span><span class="n">conda</span> <span class="n">create</span> <span class="o">--</span><span class="n">name</span> <span class="n">env_name</span> <span class="n">python</span><span class="o">=</span><span class="mf">3.5</span> <span class="o">...</span>
<span class="n">conda</span> <span class="n">create</span> <span class="o">--</span><span class="n">name</span> <span class="n">a_copy</span> <span class="n">python</span><span class="o">=</span><span class="mf">2.7</span> <span class="n">anaconda</span> <span class="c1"># with all current packages, but on different version of python
</span><span class="n">conda</span> <span class="n">create</span> <span class="o">--</span><span class="n">name</span> <span class="n">from_list</span> <span class="o">--</span><span class="nb">file</span> <span class="n">requirements</span><span class="o">.</span><span class="n">txt</span>
<span class="c1"># activate and deactivate (conda 4.4+ also supports: conda activate env_name / conda deactivate)
</span><span class="n">source</span> <span class="n">activate</span> <span class="n">env_name</span>
<span class="n">source</span> <span class="n">deactivate</span>
<span class="c1"># list
</span><span class="n">conda</span> <span class="n">search</span> <span class="n">pandas</span>
<span class="n">conda</span> <span class="nb">list</span> <span class="n">pandas</span>
<span class="n">conda</span> <span class="nb">list</span> <span class="o">--</span><span class="n">explicit</span> <span class="o">></span> <span class="n">requirements</span><span class="o">.</span><span class="n">txt</span> <span class="c1"># dump packages in env
</span>
<span class="c1"># install / remove
</span><span class="n">conda</span> <span class="n">install</span> <span class="o"><</span><span class="n">package</span><span class="o">-</span><span class="n">name</span><span class="o">></span>
<span class="n">conda</span> <span class="n">remove</span> <span class="o">--</span><span class="n">name</span> <span class="o"><</span><span class="n">env_name</span><span class="o">></span> <span class="o">--</span><span class="nb">all</span> <span class="c1"># totally clear the environment
</span>
<span class="c1"># info
</span><span class="n">conda</span> <span class="n">info</span> <span class="o">--</span><span class="n">env</span>
</code></pre></div></div>
<h2 id="-packages-and-modules"><a name="LIBRARY"></a> Packages and Modules</h2>
<p><strong>Modules</strong></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># a python source file can be used as a module
# import a python source file support.py
</span><span class="kn">import</span> <span class="nn">support</span> <span class="c1"># similar to source(support.R)
</span><span class="n">support</span><span class="o">.</span><span class="n">myfun</span><span class="p">()</span> <span class="c1"># use myfun() from support.py
</span>
<span class="c1"># import certain items from module
</span><span class="kn">from</span> <span class="nn">module_name</span> <span class="kn">import</span> <span class="n">item1</span><span class="p">,</span> <span class="n">item2</span>
<span class="kn">from</span> <span class="nn">support</span> <span class="kn">import</span> <span class="n">myfun</span> <span class="c1"># example
</span><span class="n">myfun</span><span class="p">()</span> <span class="c1"># this time no need to use module name in the front
</span>
<span class="c1"># import all names from the module
</span><span class="kn">from</span> <span class="nn">module_name</span> <span class="kn">import</span> <span class="o">*</span>
<span class="n">myfun</span><span class="p">()</span> <span class="c1"># no need to use module name in the front
</span>
</code></pre></div></div>
<p><strong>Packages</strong> <br />
A package is a hierarchical directory with multiple modules. For instance, I have a folder <code class="language-plaintext highlighter-rouge">temp</code>, with source files <code class="language-plaintext highlighter-rouge">file1</code> and <code class="language-plaintext highlighter-rouge">file2</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># to make it into a package, add an __init__.py file in the temp folder, with commands
</span><span class="n">__all__</span> <span class="o">=</span> <span class="p">[</span><span class="s">"file1"</span><span class="p">,</span><span class="s">"file2"</span><span class="p">]</span>
<span class="kn">from</span> <span class="nn">temp.file1</span> <span class="kn">import</span> <span class="n">item1</span><span class="p">,</span> <span class="n">item2</span> <span class="o">...</span>
<span class="kn">from</span> <span class="nn">temp.file2</span> <span class="kn">import</span> <span class="n">item1</span><span class="p">,</span> <span class="n">item2</span> <span class="o">...</span>
<span class="c1"># now we can import this temp package
</span><span class="kn">import</span> <span class="nn">temp</span>
<span class="kn">from</span> <span class="nn">temp</span> <span class="kn">import</span> <span class="n">file1</span> <span class="c1"># or part of the package
</span><span class="n">file1</span><span class="o">.</span><span class="n">item1</span><span class="p">()</span>
<span class="c1"># to skip temp.
</span><span class="kn">from</span> <span class="nn">temp</span> <span class="kn">import</span> <span class="o">*</span>
<span class="n">file1</span><span class="o">.</span><span class="n">item1</span><span class="p">()</span>
<span class="n">file2</span><span class="o">.</span><span class="n">item1</span><span class="p">()</span>
</code></pre></div></div>
<ul>
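<li><code class="language-plaintext highlighter-rouge">__all__</code> controls what <code class="language-plaintext highlighter-rouge">from temp import *</code> brings in. Without it, <code class="language-plaintext highlighter-rouge">import *</code> binds every public name in the package namespace, including the directly imported <code class="language-plaintext highlighter-rouge">item1</code>, where the definition from <code class="language-plaintext highlighter-rouge">file2</code> silently overwrites the one from <code class="language-plaintext highlighter-rouge">file1</code>. With <code class="language-plaintext highlighter-rouge">__all__</code>, only the listed submodules are bound, so each item is reached through its submodule name and the collision is avoided.</li>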
<li>read more on organizing packages: <br />
<a href="https://docs.python.org/3/tutorial/modules.html#packages">Python Software Foundation: modules</a></li>
</ul>
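<p>A minimal sketch of the <code class="language-plaintext highlighter-rouge">__all__</code> behavior, assuming a hypothetical package named <code class="language-plaintext highlighter-rouge">demo_pkg</code> built in a temporary directory (the package name and the function bodies are not from the notes above). The star-import is run inside a separate dictionary so that the names it binds can be listed.</p>

```python
import importlib
import os
import sys
import tempfile

# Build a hypothetical package demo_pkg/ whose two submodules both define item1.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
sources = {
    "__init__.py": "__all__ = ['file1', 'file2']\n",
    "file1.py": "def item1():\n    return 'item1 from file1'\n",
    "file2.py": "def item1():\n    return 'item1 from file2'\n",
}
for name, text in sources.items():
    with open(os.path.join(pkg_dir, name), "w") as fh:
        fh.write(text)

sys.path.insert(0, root)
importlib.invalidate_caches()

# Run the star-import inside a fresh dict so the bound names can be inspected.
namespace = {}
exec("from demo_pkg import *", namespace)
bound = sorted(name for name in namespace if not name.startswith("__"))

first = namespace["file1"].item1()
second = namespace["file2"].item1()
print(bound, first, second)
```

<p>Only the two submodule names are bound, so the two <code class="language-plaintext highlighter-rouge">item1</code> functions stay distinguishable.</p>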
<h3 id="useful-modulespackages">Useful Modules/Packages</h3>
<ul>
<li>
<p><strong><code class="language-plaintext highlighter-rouge">os</code></strong>: provides functions for interacting with the operating system (paths, files, environment variables)</p>
</li>
<li><strong><code class="language-plaintext highlighter-rouge">sys</code></strong>: system-specific parameters and functions
<ul>
<li><code class="language-plaintext highlighter-rouge">sys.path ...</code>: the list of directories searched for modules; it can be modified</li>
<li><code class="language-plaintext highlighter-rouge">sys.argv</code>: command line inputs</li>
<li><code class="language-plaintext highlighter-rouge">sys.exit(code)</code>: exit program with <code class="language-plaintext highlighter-rouge">code</code></li>
</ul>
</li>
<li><strong><code class="language-plaintext highlighter-rouge">subprocess</code></strong>
<ul>
<li><code class="language-plaintext highlighter-rouge">subprocess.call(cmd, shell=True)</code>: run <code class="language-plaintext highlighter-rouge">cmd</code> as shell command</li>
</ul>
</li>
<li><strong><code class="language-plaintext highlighter-rouge">re</code></strong> module for regular expressions: <br />
<a href="https://www.tutorialspoint.com/python/python_reg_expressions.htm">Python regular expression</a></li>
</ul>
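<p>The sketch below touches each of the four modules once. The path components, the child command, and the sample string are made up for illustration; <code class="language-plaintext highlighter-rouge">subprocess.run</code> with an argument list is used here in place of <code class="language-plaintext highlighter-rouge">subprocess.call(cmd, shell=True)</code> because it makes capturing the output straightforward.</p>

```python
import os
import re
import subprocess
import sys

# os: build a path in a platform-independent way.
config_path = os.path.join("project", "config", "pars.yml")

# sys: argv[0] is the script name; the remaining items are the user's arguments.
script_args = sys.argv[1:]

# subprocess: run a child process and capture what it prints.
# Passing a list of arguments (no shell=True) avoids shell quoting issues.
completed = subprocess.run(
    [sys.executable, "-c", "print('hello from child')"],
    capture_output=True,
    text=True,
    check=True,
)
child_output = completed.stdout.strip()

# re: extract all runs of digits from a string.
numbers = re.findall(r"\d+", "run12_sample7.txt")

print(config_path, script_args, child_output, numbers)
```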
<h2 id="-python-for-data-analysis"><a name="data"></a> Python for data analysis</h2>
<p>Python's built-in data structures are not convenient for data analysis. A few popular packages give Python many R-like capabilities and are well suited to data science work.</p>
<ul>
<li><strong><code class="language-plaintext highlighter-rouge">NumPy</code></strong> for
<ul>
<li>math function evaluations (on arrays, list etc)</li>
<li>sampling from distribution (<code class="language-plaintext highlighter-rouge">numpy.random.***</code>)</li>
<li>array (matrix) manipulation</li>
<li>tutorial: <a href="https://docs.scipy.org/doc/numpy-dev/user/quickstart.html">NumPy quick start</a></li>
</ul>
</li>
<li><strong><code class="language-plaintext highlighter-rouge">SciPy</code></strong> for scientific computing
<ul>
<li>linear algebra</li>
<li>numerical integration</li>
<li>optimization</li>
</ul>
</li>
<li><strong><code class="language-plaintext highlighter-rouge">Pandas</code></strong> for
<ul>
<li><code class="language-plaintext highlighter-rouge">Series</code>: fixed-size dict</li>
<li><code class="language-plaintext highlighter-rouge">DataFrame</code>: most commonly used object</li>
<li><code class="language-plaintext highlighter-rouge">Pandas</code> objects are suitable for most <code class="language-plaintext highlighter-rouge">Numpy</code> functions</li>
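<li>read and write files:
<ul>
<li>write: <code class="language-plaintext highlighter-rouge">Series.to_csv, DataFrame.to_csv</code></li>
<li>read: <code class="language-plaintext highlighter-rouge">pandas.read_csv</code> (the older <code class="language-plaintext highlighter-rouge">Series.from_csv</code> and <code class="language-plaintext highlighter-rouge">DataFrame.from_csv</code> were removed in pandas 1.0)</li>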
</ul>
</li>
<li>tutorial: <br />
<a href="http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dsintro">pandas data structure</a> <br />
<a href="http://pandas.pydata.org/pandas-docs/stable/tutorials.html">pandas tutorial</a></li>
</ul>
</li>
<li><strong><code class="language-plaintext highlighter-rouge">matplotlib</code></strong> for visualization
<ul>
<li>tutorial: <a href="https://matplotlib.org/users/pyplot_tutorial.html#controlling-line-properties">matplotlib tutorial</a></li>
</ul>
</li>
<li><strong><code class="language-plaintext highlighter-rouge">sklearn</code></strong> for
<ul>
<li>machine learning tools</li>
</ul>
</li>
<li><strong><code class="language-plaintext highlighter-rouge">yaml</code></strong> for
<ul>
<li>read parameters from files:</li>
</ul>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="s">"""
The following is my parameter file:
par1: a
par2: [b,c]
par3: (d,e,f)
"""</span>
<span class="kn">import</span> <span class="nn">yaml</span>
<span class="n">txt</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">file_path</span><span class="p">)</span>
<span class="n">pars</span> <span class="o">=</span> <span class="n">yaml</span><span class="o">.</span><span class="n">safe_load</span><span class="p">(</span><span class="n">txt</span><span class="p">)</span>
<span class="n">txt</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</code></pre></div> </div>
<p>Then the parameters are saved as elements in <code class="language-plaintext highlighter-rouge">pars</code> which is a <code class="language-plaintext highlighter-rouge">dict</code> type variable.</p>
</li>
<li>
<p><strong><code class="language-plaintext highlighter-rouge">pickle</code></strong> for saving and restoring python work space:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kn">import</span> <span class="nn">pickle</span>
<span class="c1"># Save objects:
</span> <span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s">"results.pckl"</span><span class="p">,</span><span class="s">"wb"</span><span class="p">)</span>
<span class="n">pickle</span><span class="o">.</span><span class="n">dump</span><span class="p">([</span><span class="n">obj1</span><span class="p">,</span><span class="n">obj2</span><span class="p">,</span><span class="n">obj3</span><span class="p">],</span><span class="n">f</span><span class="p">)</span>
<span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="c1"># restore the objects:
</span> <span class="n">f</span><span class="o">=</span><span class="nb">open</span><span class="p">(</span><span class="s">"results.pckl"</span><span class="p">,</span><span class="s">"rb"</span><span class="p">)</span>
<span class="n">obj1</span><span class="p">,</span><span class="n">obj2</span><span class="p">,</span><span class="n">obj3</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">f</span><span class="p">)</span>
<span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</code></pre></div> </div>
</li>
<li><strong>data structure</strong> modules:
<ul>
<li><code class="language-plaintext highlighter-rouge">heapq</code> for implementations of heaps</li>
</ul>
</li>
</ul>
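<p>NumPy, Pandas and the other third-party packages need a separate install, so this sketch sticks to the two standard-library modules from the list, <code class="language-plaintext highlighter-rouge">heapq</code> and <code class="language-plaintext highlighter-rouge">pickle</code>. The sample numbers and the temporary file location are assumptions for illustration.</p>

```python
import heapq
import os
import pickle
import tempfile

# heapq keeps the smallest item at index 0 of a plain list.
scores = [7, 2, 9, 4]
heapq.heapify(scores)
heapq.heappush(scores, 1)
smallest = heapq.heappop(scores)      # removes and returns the minimum
top_two = heapq.nlargest(2, scores)   # the two largest remaining values

# pickle round trip; the "with" blocks close the file automatically.
path = os.path.join(tempfile.mkdtemp(), "results.pckl")
with open(path, "wb") as fh:
    pickle.dump([smallest, top_two], fh)
with open(path, "rb") as fh:
    restored_smallest, restored_top_two = pickle.load(fh)

print(restored_smallest, restored_top_two)
```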
<h2 id="-selected-topics"><a name="topics"></a> Selected topics</h2>
<p><strong>Sorting</strong></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># sort list using member function
# note: tuple, dict do not have .sort() member
</span><span class="n">ls</span> <span class="o">=</span><span class="p">[</span><span class="n">a</span><span class="p">,</span><span class="n">b</span><span class="p">,</span><span class="n">c</span><span class="p">]</span>
<span class="n">ls</span><span class="o">.</span><span class="n">sort</span><span class="p">()</span> <span class="c1"># default sorting
</span><span class="n">ls</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="n">key</span> <span class="o">=</span> <span class="n">myfun</span><span class="p">)</span> <span class="c1"># customized sorting
</span>
<span class="c1"># sort using sorted():
# note: works on any iterable (tuple, dict keys, ...) and returns a new list
</span><span class="nb">sorted</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="n">key</span> <span class="o">=</span> <span class="n">myfun</span><span class="p">)</span>
</code></pre></div></div>
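<p>A small concrete run of both approaches; the sample data is made up. Note the difference in return values: <code class="language-plaintext highlighter-rouge">list.sort()</code> modifies the list in place, while <code class="language-plaintext highlighter-rouge">sorted()</code> leaves its input alone and returns a new list.</p>

```python
words = ["pear", "fig", "banana"]

# list.sort() sorts in place and returns None.
words.sort(key=len)

# sorted() accepts any iterable and always returns a new list.
ordered_tuple = sorted((3, 1, 2), reverse=True)

# Iterating over a dict yields its keys, so sorted(d) sorts the keys.
counts = {"b": 2, "a": 3, "c": 1}
keys_sorted = sorted(counts)

# To order by value, sort the (key, value) pairs with a key function.
by_value = sorted(counts.items(), key=lambda pair: pair[1])

print(words, ordered_tuple, keys_sorted, by_value)
```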
<h2 id="-others"><a name="other"></a> Others</h2>
<ul>
<li>
<p>get help: <br />
type <code class="language-plaintext highlighter-rouge">pydoc python_function</code> in a terminal to get the documentation for a Python function</p>
</li>
<li>IDE recommendation
<ul>
<li><a href="https://www.continuum.io/downloads">Anaconda</a></li>
</ul>
</li>
<li>
<p>object ids (like address of an object in C): <br />
<code class="language-plaintext highlighter-rouge">id(var)</code>, unique identification number for each object</p>
</li>
<li><strong>output formatting</strong>
<ul>
<li><a href="https://pyformat.info/">link1</a> or <a href="https://www.python-course.eu/python3_formatted_output.php">link2</a></li>
<li>syntax for format placeholder <code class="language-plaintext highlighter-rouge">{:[flags][width][.precision]type}</code>
<ul>
<li>example: <code class="language-plaintext highlighter-rouge">"{:10.2f}".format(27.1205)</code></li>
<li>types: <code class="language-plaintext highlighter-rouge">d, f, e, s</code></li>
<li>flags
<ul>
<li>left/right align: <code class="language-plaintext highlighter-rouge"><, ></code></li>
<li>center: <code class="language-plaintext highlighter-rouge">^</code></li>
</ul>
</li>
<li>use variable: <code class="language-plaintext highlighter-rouge">"{v:10d}".format(v=10)</code></li>
</ul>
</li>
</ul>
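<p>A few placeholders in action, as a quick sketch; the values are arbitrary. <code class="language-plaintext highlighter-rouge">repr</code> is used when printing so that the padding spaces are visible.</p>

```python
value = 27.1205

fixed = "{:10.2f}".format(value)     # width 10, 2 decimals, right-aligned by default
left = "{:<10.2f}|".format(value)    # left-aligned, padding goes on the right
center = "{:^9d}".format(42)         # centered in 9 characters
named = "{v:5d}".format(v=10)        # placeholder filled by keyword argument
sci = "{:.3e}".format(12345.678)     # scientific notation with 3 decimals

for text in (fixed, left, center, named, sci):
    print(repr(text))
```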
</li>
<li>
<p>assertion: <br />
<code class="language-plaintext highlighter-rouge">assert expression, "error message"</code>: if the expression evaluates to <code class="language-plaintext highlighter-rouge">False</code>, an <code class="language-plaintext highlighter-rouge">AssertionError</code> is raised and the error message is printed</p>
</li>
<li>
<p><strong><code class="language-plaintext highlighter-rouge">try-except</code> exception handling, and <code class="language-plaintext highlighter-rouge">raise</code> exceptions</strong></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># basic syntax
</span><span class="k">try</span><span class="p">:</span>
    <span class="k">pass</span>  <span class="c1"># code that may raise an error
</span><span class="k">except</span><span class="p">:</span>
    <span class="k">pass</span>  <span class="c1"># what to do when an error occurs
</span>
<span class="c1"># more detailed
</span><span class="k">try</span><span class="p">:</span>
    <span class="k">pass</span>  <span class="c1"># your operations
</span><span class="k">except</span> <span class="p">(</span><span class="n">ExceptionI</span><span class="p">,</span> <span class="n">ExceptionII</span><span class="p">):</span>
    <span class="k">pass</span>  <span class="c1"># runs if ExceptionI or ExceptionII is raised
</span><span class="k">except</span> <span class="n">ExceptionIII</span><span class="p">:</span>
    <span class="k">pass</span>  <span class="c1"># runs if ExceptionIII is raised
</span><span class="k">else</span><span class="p">:</span>
    <span class="k">pass</span>  <span class="c1"># runs if no exception was raised
</span><span class="k">finally</span><span class="p">:</span>
    <span class="k">pass</span>  <span class="c1"># always runs
</span>
<span class="c1"># raise exceptions, eg:
</span><span class="k">def</span> <span class="nf">raise_exception</span><span class="p">(</span><span class="n">s</span><span class="p">):</span>
    <span class="k">if</span> <span class="n">s</span> <span class="o"><</span> <span class="mi">10</span><span class="p">:</span>
        <span class="k">raise</span> <span class="n">Errortype</span><span class="p">(</span><span class="s">"error message"</span><span class="p">)</span>
    <span class="k">return</span>

<span class="k">try</span><span class="p">:</span>
    <span class="n">raise_exception</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>  <span class="c1"># 5 is below 10, so Errortype is raised
</span><span class="k">except</span> <span class="n">Errortype</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="nb">print</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>  <span class="c1"># actions on this exception
</span>    <span class="k">raise</span>  <span class="c1"># re-raise it to the caller
</span><span class="c1"># ........
</span>
<span class="c1"># define your own error type
</span><span class="k">class</span> <span class="nc">my_err</span><span class="p">(</span><span class="nb">Exception</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">message</span><span class="p">):</span>
        <span class="o">...</span>

<span class="c1"># usage
</span><span class="k">try</span><span class="p">:</span>
    <span class="k">raise</span> <span class="n">my_err</span><span class="p">(</span><span class="s">"message"</span><span class="p">)</span>
<span class="k">except</span> <span class="n">my_err</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="n">actions</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
</code></pre></div> </div>
<p>for a list of Exception types, see <a href="https://www.tutorialspoint.com/python/python_exceptions.htm">exception handling</a></p>
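<p>A runnable sketch that exercises every branch of the pattern. The class <code class="language-plaintext highlighter-rouge">TooSmallError</code>, the function <code class="language-plaintext highlighter-rouge">check_value</code>, and the threshold of 10 are hypothetical names chosen for this example.</p>

```python
class TooSmallError(Exception):
    """Hypothetical custom error, raised when a value is below the threshold."""

    def __init__(self, message):
        super().__init__(message)
        self.message = message


def check_value(s):
    if s < 10:
        raise TooSmallError("value {} is smaller than 10".format(s))
    return s


log = []
for candidate in (20, 5, "abc"):
    try:
        checked = check_value(candidate)
    except TooSmallError as e:
        log.append("too small: " + e.message)
    except TypeError:
        # comparing a str with an int raises TypeError in Python 3
        log.append("wrong type")
    else:
        log.append("ok: {}".format(checked))
    finally:
        log.append("done")

print(log)
```

<p>The <code class="language-plaintext highlighter-rouge">else</code> branch runs only for the first value, while <code class="language-plaintext highlighter-rouge">finally</code> runs for all three.</p>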
</li>
<li>
<p>reload module in interactive mode: <br />
When developing a package, you constantly modify modules that have already been imported. To re-import a module after modifying it, <code class="language-plaintext highlighter-rouge">import importlib</code> and call <code class="language-plaintext highlighter-rouge">importlib.reload(module.file)</code>.</p>
</li>
<li>
<p>Write python <strong>main function</strong>: <br />
Python does not have an <code class="language-plaintext highlighter-rouge">int main()</code> function as C/C++ do, but you can do the following to get something similar to <code class="language-plaintext highlighter-rouge">main()</code>:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">XXXXX</span>
<span class="c1"># your code here
</span><span class="k">def</span> <span class="nf">myfun</span><span class="p">():</span>
    <span class="c1"># ....
</span>    <span class="k">pass</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">"__main__"</span><span class="p">:</span>
    <span class="c1"># when the file is run directly
</span>    <span class="c1"># do something
</span>    <span class="k">pass</span>
<span class="k">else</span><span class="p">:</span>
    <span class="c1"># when the file is imported by another module
</span>    <span class="c1"># do something else
</span>    <span class="k">pass</span>
</code></pre></div> </div>
<ul>
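<li>the variable <code class="language-plaintext highlighter-rouge">__name__</code> holds the name of the <strong>current module</strong>; it equals <code class="language-plaintext highlighter-rouge">"__main__"</code> when the file is run directly</li>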
</ul>
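<p>One common way to arrange this, sketched under the assumption that the script only needs to report its arguments: the work lives in a <code class="language-plaintext highlighter-rouge">main(argv)</code> function (a naming convention, not something Python requires), and the guard just calls it.</p>

```python
import sys


def myfun():
    """Hypothetical helper standing in for the real work of the script."""
    return "work done"


def main(argv):
    # argv holds the command-line arguments without the script name.
    print("received {} argument(s)".format(len(argv)))
    print(myfun())
    return 0  # conventional exit status for success


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    # A real script would typically pass this status to sys.exit().
    status = main(sys.argv[1:])
```

<p>Keeping the logic in <code class="language-plaintext highlighter-rouge">main</code> means another module can import the file and call <code class="language-plaintext highlighter-rouge">main([...])</code> with its own arguments.</p>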
</li>
<li>
<p>the <strong><code class="language-plaintext highlighter-rouge">with</code> keyword</strong></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># syntax
</span><span class="k">with</span> <span class="n">expression</span> <span class="p">[</span><span class="k">as</span> <span class="n">variable</span><span class="p">]:</span>
    <span class="k">with</span><span class="o">-</span><span class="n">block</span>
<span class="c1"># example
</span><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'output.txt'</span><span class="p">,</span> <span class="s">'w'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="s">'Hi there!'</span><span class="p">)</span>
<span class="s">"""
1. the file is closed automatically at the end of the with-block
2. the file is closed even if an exception is raised inside the block
"""</span>
</code></pre></div> </div>
</li>
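</ul>Li Zengli.zeng@yale.eduResources"Analysis" of U.S. PhD Survey Data2018-05-25T00:00:00+00:002018-05-25T00:00:00+00:00https://zenglix.github.io/PhdData<p>I have been busy graduating lately. I thought things would ease up once the dissertation was written, but a closer look at the school's graduation checklist showed a whole pile of surveys still to fill out. One of them is the <a href="https://www.nsf.gov/statistics/2018/nsf18304/survey-description.cfm"><strong>Survey of Earned Doctorates</strong></a>, which is effectively a census of U.S. PhDs; the school will not accept your dissertation submission until you have completed it. The questionnaire is very rich, covering the educational background, family background, and future plans of PhDs in every field. The survey's website provides some official reports, but compared with what we fill in, those reports are not as rich or as interesting.</p>
<p>Fortunately, part of the statistical data is also public, so you can dig out a lot of interesting things yourself. I picked a few PhD fields and did some "analysis" (really just drawing plots…). If the field you are interested in is not in the figures, you can lightly modify the code in my <a href="https://github.com/zengliX/Notebooks/blob/master/Earned_phd_data.ipynb">Python Notebook</a> to get the figure you want.</p>
<p>The data for 2018 and 2017 do not seem to be public yet, so the results below only cover 2016 and earlier.</p>
<h2 id="phd-毕业人数">Number of PhD graduates</h2>
<h3 id="总phd人数">Total number of PhDs</h3>
<p>The figure below shows the total number of PhD graduates per year in the U.S. from 1957 to 2016. Over these 60 years the number grew from 10,000 to more than 50,000. In the last ten years it has kept growing by roughly 1,000 per year.</p>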
<p><img src="https://raw.githubusercontent.com/zengliX/Notebooks/master/Figures/PhDdata/total_phd.png" width="70%" /></p>
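<h3 id="各领域phd人数">PhDs by field</h3>
<p>So which fields have the most PhDs? Below are the data from 1986 to 2016. In terms of headcount, BBS and Psychology and social science lead the pack, but the comparison is a bit unfair, because BBS is only a second-level field under Life Science, while Psychology and social science is a first-level field. So BBS is still the most impressive: it started from the largest base and also grew the most. Truly the century of biology 👍. The absolute number for CS is smaller than I expected, but its relative increase is the largest here, from about 400 in 1986 to 2,000 in 2016.</p>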
<p><img src="https://raw.githubusercontent.com/zengliX/Notebooks/master/Figures/PhDdata/phd.fields.png" alt="" /></p>
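<h3 id="phd最多的学校">Schools with the most PhDs</h3>
<p>In 2016 about 54,000 new PhDs were produced, from more than 400 different universities and research institutes. Below are the 40 universities that awarded the most PhDs; see whether you can find your alma mater 😎. My own Yale just barely made the list. The school with the most PhDs, which I certainly did not expect, is the University of Texas at Austin! After all, Austin is a strong candidate city for Amazon's second headquarters.</p>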
<p><img src="https://raw.githubusercontent.com/zengliX/Notebooks/master/Figures/PhDdata/top_univ.png" alt="" /></p>
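<h2 id="男女比例">Gender ratio</h2>
<p>The gender ratio is probably an awkward topic for many science and engineering PhDs. Let's see just how awkward the data can get.</p>
<p>Below is the share of male PhDs in each field for 1986-2016. The male share is clearly declining in every field; even in Education, where women were already the majority, the male share shows a sustained decline. So don't call the present awkward: look at the 95% figure for Engineering 30 years ago... Looking again at the roughly 80% male share in Math&CS, my Biostat department cannot do much to help, since the share of women in my department is probably above 90% 😎.</p>
<p>Whether it is a more relaxed social environment or the diversity policies in university admissions, more women have started to pursue higher degrees. However, looking only at the male share over the last 10 years, the downward trend has clearly slowed or even disappeared in many disciplines. Does this mean that the gender ratio in each discipline is approaching an equilibrium?</p>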
<p><img src="https://raw.githubusercontent.com/zengliX/Notebooks/master/Figures/PhDdata/male_ratio.png" alt="" /></p>
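<h2 id="美帝的-international-phd-们">International PhDs in the U.S.</h2>
<h3 id="各学科-international-phd-比例变化-1986---2016">Share of international PhDs by field, 1986 - 2016</h3>
<p>First, let's look at the share of international students in each field. For most fields the share has changed little in the last ten years, while CS has shown clear growth in the last 5 years. Quite a lot of Americans go into biomedical science, psychology, and social science. In the more engineering-oriented fields, CS, ME, and EE, about half of the students are international.</p>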
<p><img src="https://raw.githubusercontent.com/zengliX/Notebooks/master/Figures/PhDdata/visa_ratio.png" alt="" /></p>
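<h3 id="international-phd-十大来源国">Top ten countries of origin for international PhDs</h3>
<p>Next, the top 10 countries contributing international PhDs. In the 2016 data, China leads by a wide margin with 5,000, so on average roughly one in every 10 PhDs comes from China. India is next, as expected. Iran is a surprise: it is the fourth largest source, and its number of PhDs has grown about fivefold in the last 10 years.</p>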
<p><img src="https://raw.githubusercontent.com/zengliX/Notebooks/master/Figures/PhDdata/visa_count.png" width="70%" /></p>
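<h2 id="phd要读多久">How long does a PhD take?</h2>
<p>The figure below shows how long it takes from starting graduate school (including any master's program before the PhD program) to earning a PhD. It shows that the time needed to finish a PhD keeps getting shorter. So finishing a PhD in 5-6 years nowadays is already quick! Humanities & Arts and Education are not in the figure because they do not fit; they probably take 9+ years...</p>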
<p><img src="https://raw.githubusercontent.com/zengliX/Notebooks/master/Figures/PhDdata/years1.png" alt="" /></p>
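<p>Below is the time the 2016 graduates took to finish their PhDs. One measure is the same as in the figure above, the time from starting graduate school to receiving the PhD; the other is the time from entering the PhD program to receiving the PhD. The gap between the two is especially large for Education; I wonder whether that is because a master's degree plus work experience is needed to get into a PhD program? I would welcome an answer from someone in the field. Looking only at the time after entering the PhD program, most fields other than Humanity&Arts seem to fall between 5 and 6 years. Engineering has the shortest average time to degree, and second place, surprisingly, goes to Life Science, comfortably beating math, physics, and computer science 😳 What is going on here? Aren't life science PhDs generally known for taking a long time...</p>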
<p><img src="https://raw.githubusercontent.com/zengliX/Notebooks/master/Figures/PhDdata/years_2016.png" alt="" /></p>
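<h2 id="phd就业去向待遇">PhD employment destinations and pay</h2>
<h3 id="各专业phd选择学术领域的比例">Share of PhDs choosing academia, by field</h3>
<p>The figure below shows the share of PhDs in each field who stayed in academia, 1996-2016. The fields differ considerably on this measure. Humanities & Arts PhDs almost all went the faculty route, whereas Engineering graduates almost all went to industry. Comparing 2016 with 2011, the share choosing academic positions dropped noticeably in almost every field. Math&CS in particular has kept declining over the whole 20 years...</p>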
<p><img src="https://raw.githubusercontent.com/zengliX/Notebooks/master/Figures/PhDdata/academia.png" alt="" /></p>
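<h3 id="各专业就业领域-base-salary-比较">Base Salary by field and employment sector</h3>
<p>Now to the question everyone cares about most: PhD graduates' salaries. Below I list the average <strong>Base Salary</strong> of PhDs by field and by the sector they went into. Business school graduates lead by a wide margin, with high pay even in academia; Econ and Math&CS follow closely. But what is going on with Business PhDs who go to Nonprofit having a starting salary that much higher 😳...</p>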
<p><img src="https://raw.githubusercontent.com/zengliX/Notebooks/master/Figures/PhDdata/empl_salary.png" alt="" /></p>
<p>Time was limited, and I did not read some details of the survey very carefully. If I have misinterpreted anything, corrections are welcome 🙏</p>Li Zengli.zeng@yale.eduBuild your own website (with Jekyll and Minimal-mistakes theme)2017-08-17T00:00:00+00:002017-08-17T00:00:00+00:00https://zenglix.github.io/personal_website<p>This whole website is built using <a href="https://jekyllrb.com/"><strong>Jekyll</strong></a> and the theme <a href="https://mmistakes.github.io/minimal-mistakes/"><strong>minimal-mistakes</strong></a>.
I had no experience with HTML or CSS, but still managed to get the website to work. That’s just how easy it is to work with <strong>Jekyll</strong>.</p>
<p>I won’t go through every step of the process in great detail, as that would be too much text. Instead, I’ll point you to the corresponding documentation, which should guide you through each step.</p>
<h3 id="other-resource">Other resources:</h3>
<ul>
<li>Learning about Github Pages: <a href="http://jmcglone.com/guides/github-pages/">Github hosting</a></li>
</ul>
<h2 id="step-1-install-jekyll-and-minimal-mistakes-theme-and-generate-template-website">Step 1: install <em>Jekyll</em> and <em>minimal-mistakes</em> theme and generate template website</h2>
<h3 id="install-jekyll">install <em>Jekyll</em></h3>
<p>Run <code class="language-plaintext highlighter-rouge">gem install jekyll</code> in a terminal to install it. <br />
This should work fine when you have all the dependencies installed. If you run into any issues, see the <a href="https://jekyllrb.com/docs/installation/">jekyll install</a> page.</p>
<h3 id="install-minimal-mistakes">install <em>minimal-mistakes</em></h3>
<p>It takes a little more work to install the theme. <br />
You first need to generate a Jekyll folder: <code class="language-plaintext highlighter-rouge">jekyll new anyname</code>. This creates a folder named <code class="language-plaintext highlighter-rouge">anyname</code> in your current directory, containing all the basic files needed to make a website. <br />
Then follow the steps described in <a href="https://mmistakes.github.io/minimal-mistakes/docs/quick-start-guide/">install minimal-mistakes</a>. This basically involves adding some lines to some files in <code class="language-plaintext highlighter-rouge">anyname</code>, replacing a file with the theme's version from GitHub, and removing some files.<br />
After you run <code class="language-plaintext highlighter-rouge">bundle install</code>, all the dependencies of <strong>minimal-mistakes</strong> should be successfully installed.</p>
<p>Type <code class="language-plaintext highlighter-rouge">bundle exec jekyll serve</code>, and a template website will be hosted locally at <a href="http://localhost:4000/">localhost:4000/</a>.</p>
<p>It should look something like this: <br />
<img src="/pics/website_tut/fresh.png" alt="fresh site" /></p>
<p>Now there is a lot you can do,
such as setting the site title in the top-left corner, changing the author name and biography, or adding social sharing links. You can find the corresponding fields for all of these in <code class="language-plaintext highlighter-rouge">_config.yml</code> (refer to <a href="https://mmistakes.github.io/minimal-mistakes/docs/configuration/">configuration</a>).</p>
<h2 id="step-2-add-pages">Step 2: add pages</h2>
<p>Website pages are specified in file <code class="language-plaintext highlighter-rouge">_data/navigation.yml</code>:</p>
<div class="language-yml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">main</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">title</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Quick-Start</span><span class="nv"> </span><span class="s">Guide"</span>
<span class="na">url</span><span class="pi">:</span> <span class="s">/docs/quick-start-guide/</span>
<span class="pi">-</span> <span class="na">title</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Posts"</span>
<span class="na">url</span><span class="pi">:</span> <span class="s">/year-archive/</span>
<span class="pi">-</span> <span class="na">title</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Categories"</span>
<span class="na">url</span><span class="pi">:</span> <span class="s">/categories/</span>
<span class="pi">-</span> <span class="na">title</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Tags"</span>
<span class="na">url</span><span class="pi">:</span> <span class="s">/tags/</span>
<span class="pi">-</span> <span class="na">title</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Pages"</span>
<span class="na">url</span><span class="pi">:</span> <span class="s">/page-archive/</span>
<span class="pi">-</span> <span class="na">title</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Collections"</span>
<span class="na">url</span><span class="pi">:</span> <span class="s">/collection-archive/</span>
<span class="pi">-</span> <span class="na">title</span><span class="pi">:</span> <span class="s2">"</span><span class="s">External</span><span class="nv"> </span><span class="s">Link"</span>
<span class="na">url</span><span class="pi">:</span> <span class="s">https://google.com</span>
</code></pre></div></div>
<p>Each <code class="language-plaintext highlighter-rouge">-</code> corresponds to one page tab:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">title</code> is the page title</li>
<li><code class="language-plaintext highlighter-rouge">url</code> is the link to the file that contains contents of the page</li>
</ul>
<p>For example if you want to add a new tab <code class="language-plaintext highlighter-rouge">Blogs</code>, you can add the following two lines to <code class="language-plaintext highlighter-rouge">navigation.yml</code>:</p>
<div class="language-yml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err"> </span><span class="pi">-</span> <span class="na">title</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Blogs"</span>
<span class="err"> </span> <span class="na">url</span><span class="pi">:</span> <span class="s">/Blogs/</span>
</code></pre></div></div>
<p>Then make a new directory <code class="language-plaintext highlighter-rouge">/_pages/</code> and create inside it a Markdown file <code class="language-plaintext highlighter-rouge">blogs.md</code> containing:</p>
<div class="language-md highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="na">title</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Blogs"</span>
<span class="na">layout</span><span class="pi">:</span> <span class="s">archive</span>
<span class="na">permalink</span><span class="pi">:</span> <span class="s">/Blogs/</span>
<span class="na">author_profile</span><span class="pi">:</span> <span class="no">true</span>
<span class="na">comments</span><span class="pi">:</span> <span class="no">true</span>
<span class="nn">---</span>
This is my blog page.
</code></pre></div></div>
<p>Be sure that <code class="language-plaintext highlighter-rouge">permalink:</code> matches the <code class="language-plaintext highlighter-rouge">url</code> in the <code class="language-plaintext highlighter-rouge">navigation.yml</code> file.
Generate the website again, and see what’s new: <br />
<img src="/pics/website_tut/blog.png" alt="blogs" /></p>
<p>That’s basically how a new webpage tab is added.</p>
<h2 id="step-3-add-posts">Step 3: add posts</h2>
<p>Right now your Blog page is just a single page. Next, we are going to add posts to this page.</p>
<h3 id="posts">Posts</h3>
<p>Posts should be kept in the <code class="language-plaintext highlighter-rouge">_posts</code> folder and named in the form <code class="language-plaintext highlighter-rouge">YEAR-MONTH-DAY-filename.md</code> so that <em>minimal-mistakes</em> can correctly identify them.</p>
<p>An example post markdown file:</p>
<div class="language-md highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="na">layout</span><span class="pi">:</span> <span class="s">single</span>
<span class="na">title</span><span class="pi">:</span> <span class="s2">"</span><span class="s">My</span><span class="nv"> </span><span class="s">first</span><span class="nv"> </span><span class="s">post"</span>
<span class="na">date</span><span class="pi">:</span> <span class="s">2016-11-11</span>
<span class="nn">---</span>
my first post looks just fine
</code></pre></div></div>
<p>where <code class="language-plaintext highlighter-rouge">layout: single</code> specifies that this is a single-page post; <code class="language-plaintext highlighter-rouge">title</code> appears at the top of the page; <code class="language-plaintext highlighter-rouge">date</code> records the time of the “latest update” and can be used to sort your posts. There are also a bunch of other parameters you can specify: <a href="https://mmistakes.github.io/minimal-mistakes/docs/posts/">other parameters</a>.</p>
<p>Let’s create this toy post <code class="language-plaintext highlighter-rouge">toy.md</code> and put it in <code class="language-plaintext highlighter-rouge">_posts</code>.</p>
<h3 id="modify-page-file-to-include-posts">Modify page file to include posts</h3>
<p>In order to put <code class="language-plaintext highlighter-rouge">toy.md</code> on your Blog page, you need to add commands in <code class="language-plaintext highlighter-rouge">blogs.md</code> to manually include it.</p>
<p>There are many ways to do this. I’ll just give one example: <br />
open the previous <code class="language-plaintext highlighter-rouge">blogs.md</code> file and add the following lines:</p>
<p><img src="/pics/website_tut/addToPage.png" alt="addToPage" /></p>
<p>This is basically <a href="http://shopify.github.io/liquid/">Liquid language</a>. You can adapt this block of code to get different display: sort by year, month or category etc.</p>
<p>Rename <code class="language-plaintext highlighter-rouge">toy.md</code> with the prefix <code class="language-plaintext highlighter-rouge">YEAR-MONTH-DAY</code> (for example <code class="language-plaintext highlighter-rouge">2016-11-11-toy.md</code>). Then you can see it on the Blog page:</p>
<p><img src="/pics/website_tut/new_blog.png" alt="new blog" /></p>
<h2 id="step-4-use-github-for-hosting">Step 4: use Github for hosting</h2>
<p><a href="https://pages.github.com/">Github pages</a> is a great choice for free hosting. If you create a github repository named <code class="language-plaintext highlighter-rouge">username.github.io</code>, it will automatically convert the contents to a webpage at address <code class="language-plaintext highlighter-rouge">username.github.io</code>. However to make this work, it takes more than just uploading all the related local files to the repository.</p>
<p>The safest way is to copy the repository of <a href="https://github.com/mmistakes/minimal-mistakes">minimal mistakes</a>, and replace some files with your own version. Steps as below:</p>
<ul>
<li>Copy <strong>minimal mistakes</strong>:</li>
</ul>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># create a new folder for this online version website</span>
<span class="nb">mkdir </span>GitPage
<span class="nb">cd </span>GitPage
<span class="c"># initiate git repository</span>
git init
<span class="c"># add minimal-mistakes repository to your remote</span>
git remote add minimal <span class="s2">"https://github.com/mmistakes/minimal-mistakes"</span>
<span class="c"># pull the contents of that repository into your folder</span>
git pull minimal master
</code></pre></div></div>
<ul>
<li>
<p>Customize the files: <br />
After you pull all the stuff, you get a version of <a href="https://mmistakes.github.io/minimal-mistakes/">minimal mistakes website</a> on your computer. We want to keep the framework, remove the contents and put our stuff in.</p>
<ul>
<li>You can safely delete folders <code class="language-plaintext highlighter-rouge">/docs</code> and <code class="language-plaintext highlighter-rouge">/test</code>.</li>
<li>replace <code class="language-plaintext highlighter-rouge">/_data/navigation.yml</code>, <code class="language-plaintext highlighter-rouge">/_data/ui-text.yml</code> and <code class="language-plaintext highlighter-rouge">/_config.yml</code> with your own version.</li>
<li>move your <code class="language-plaintext highlighter-rouge">_pages</code> and <code class="language-plaintext highlighter-rouge">_posts</code> folder to here</li>
</ul>
</li>
<li>
<p>Push your customized version online:
Create the <code class="language-plaintext highlighter-rouge">username.github.io</code> repository as mentioned previously.</p>
</li>
</ul>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># add your github repository to remote</span>
git remote add origin <span class="s2">"ADDRESS.OF.YOUR.GITHUB.REPOSITORY"</span>
<span class="c"># stage all files in the GitPage folder and commit them</span>
git add <span class="nb">.</span>
git commit <span class="nt">-a</span> <span class="nt">-m</span> <span class="s2">"first commit"</span>
<span class="c"># push the folder online</span>
git push origin master
</code></pre></div></div>
<p>If every step goes well, your personal website should be live at “username.github.io”!</p>
<p><img src="https://www.residentadvisor.net/images/labels/oh!yeah!.jpg" alt="ohyeah" /></p>
<h2 id="step-5-customize-website-style-to-be-updated">Step 5: customize website style (to be updated)</h2>
<p>Now the website is up and running, everything’s great. However, if you are eager to modify the website style to better fit your taste, it’s gonna cost a little more work.</p>Li Zengli.zeng@yale.eduThis whole website is built using Jekyll and theme minimal-mistakes. I had no experience of HTML or CSS language, but still managed to get the website to work. That’s just how easy it is to work with Jekyll.Introduction to Rcpp: making R much much faster2016-11-29T00:00:00+00:002016-11-29T00:00:00+00:00https://zenglix.github.io/Rcpp_basic<p>Package <em><strong>Rcpp</strong></em> allows you to use <em>C++</em> or <em>C</em> code in an R environment. It’s a great tool to enhance the speed of your program, at the price of longer development and harder debugging. But when it finally works out, it’s totally worth it.</p>
<p>On <em>stackoverflow</em> (as of 2016/9/22), there are 153199 questions tagged <strong>r</strong> but only 1193 tagged <strong>rcpp</strong>, which is fewer than 1% as many. This suggests that relatively few R users are also Rcpp users. The small user base leads to incomplete documentation and limited references to turn to when you get into trouble during Rcpp programming.</p>
<p>The goal of this document is to give a general introduction to Rcpp and to serve as a framework for future updates with more details. We assume knowledge of both C++ and R programming, so neither is introduced here.</p>
<h3 id="collection-of-online-references">Collection of online references</h3>
<p>You might find the following web pages useful:</p>
<ul>
<li>Hadley Wickham’s Advanced R: <a href="http://adv-r.had.co.nz/Rcpp.html">Chapter from Advanced R</a></li>
<li>Online gitbook: <a href="https://www.gitbook.com/book/teuder/introduction-to-rcpp/details">Introduction to Rcpp</a></li>
<li>The <em>Armadillo</em> library details, with introduction about all member functions: <a href="http://arma.sourceforge.net/docs.html">Armadillo Website</a></li>
<li>Rcpp documentation: <a href="http://dirk.eddelbuettel.com/code/rcpp/html/index.html">Rcpp Version 0.12.7 Documentation</a></li>
<li>Understanding R’s C interface <a href="http://adv-r.had.co.nz/C-interface.html">C interface in R</a></li>
</ul>
<h2 id="two-ways-to-incorporate-c-functions">Two ways to incorporate C++ functions</h2>
<ul>
<li><strong>Inline function definition</strong>: usage of <code class="language-plaintext highlighter-rouge">cppFunction()</code></li>
</ul>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">cppFunction</span><span class="p">(</span>
<span class="s">" int my_C_function (int x){</span><span class="err">
</span><span class="s"> int y=5;</span><span class="err">
</span><span class="s"> return x+y;</span><span class="err">
</span><span class="s">}"</span><span class="p">)</span>
</code></pre></div></div>
<ul>
<li><strong>Write .cpp source file</strong><br />
You can also write <em>.cpp</em> source files outside and use<br />
<code class="language-plaintext highlighter-rouge">sourceCpp("your_file_name.cpp")</code>
to source the file. However, there are certain rules to be followed. A simple template is shown below:</li>
</ul>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include <Rcpp.h>
</span><span class="n">using</span> <span class="n">namespace</span> <span class="n">Rcpp</span><span class="p">;</span>
<span class="cm">/* write your C++ function below; the export attribute must sit on its own line directly above the function */</span>
<span class="c1">// [[Rcpp::export]]</span>
<span class="kt">int</span> <span class="nf">my_C_function</span> <span class="p">(</span><span class="kt">int</span> <span class="n">x</span><span class="p">){</span>
<span class="k">return</span> <span class="n">x</span><span class="o">+</span><span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
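<p>To make the template concrete, here is a minimal sketch of a complete source file. The function name <code class="language-plaintext highlighter-rouge">mean_cpp</code> and the file name <code class="language-plaintext highlighter-rouge">mean_cpp.cpp</code> are made up for illustration. Note that the export attribute sits on its own line directly above the function it exports.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <Rcpp.h>
using namespace Rcpp;

// Hypothetical example: mean of a numeric vector, NA for an empty input.
// [[Rcpp::export]]
double mean_cpp(NumericVector x) {
  int n = x.size();
  if (n == 0) return NA_REAL;      // nothing to average
  double total = 0.0;
  for (int i = 0; i < n; ++i) {
    total += x[i];                 // C++ indices start at 0
  }
  return total / n;
}
</code></pre></div></div>
<p>From R, assuming the file is saved in the working directory, <code class="language-plaintext highlighter-rouge">Rcpp::sourceCpp("mean_cpp.cpp")</code> compiles it, after which <code class="language-plaintext highlighter-rouge">mean_cpp(c(1, 2, 3))</code> should return 2.</p>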
<h2 id="rcpp-data-structure">Rcpp Data Structure</h2>
<h3 id="numericvector">NumericVector</h3>
<ul>
<li>Basics:</li>
</ul>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">NumericVector</span> <span class="nf">v</span> <span class="p">(</span><span class="mi">3</span><span class="p">);</span> <span class="c1">// rep(0,3)</span>
<span class="n">NumericVector</span> <span class="n">v</span> <span class="p">{</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">};</span>
<span class="n">NumericVector</span> <span class="nf">v</span> <span class="p">(</span><span class="mi">5</span><span class="p">,</span><span class="mi">3</span><span class="p">.</span><span class="mi">0</span><span class="p">);</span> <span class="c1">// rep(3,5): five elements, each equal to 3.0</span>
<span class="n">NumericVector</span> <span class="n">v</span> <span class="o">=</span> <span class="n">NumericVector</span><span class="o">::</span><span class="n">create</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">);</span>
<span class="c1">//</span>
<span class="c1">// subsetting</span>
<span class="n">v</span><span class="p">[</span><span class="n">u</span><span class="p">];</span> <span class="c1">// where u is a LogicalVector</span>
<span class="c1">//</span>
<span class="c1">//Use `clone()` function when you don't want your vector or matrix value to be changed:</span>
<span class="n">NumericVector</span> <span class="n">v1</span><span class="o">=</span><span class="n">v2</span><span class="p">;</span> <span class="c1">// change of v1 will result in change of v2</span>
<span class="n">NumericVector</span> <span class="n">v1</span><span class="o">=</span><span class="n">clone</span><span class="p">(</span><span class="n">v2</span><span class="p">);</span> <span class="c1">// v2 will not be changed, when v1 is changed</span>
<span class="c1">//</span>
<span class="c1">// member functions:</span>
<span class="n">v</span><span class="p">.</span><span class="n">length</span><span class="p">();</span> <span class="c1">// length of v</span>
<span class="c1">//</span>
<span class="c1">// Doing iterations: </span>
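<span class="n">NumericVector</span><span class="o">::</span><span class="n">iterator</span> <span class="n">it</span> <span class="o">=</span> <span class="n">v</span><span class="p">.</span><span class="n">begin</span><span class="p">();</span> <span class="c1">// 'it' now points to the first element; advance with ++it until it reaches v.end()</span>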
</code></pre></div></div>
<p>What you get from logical vector subsetting <code class="language-plaintext highlighter-rouge">v[u]</code> is not a plain vector but a lightweight proxy object. To use it, convert it into the type you want, for example by assigning it to a <code class="language-plaintext highlighter-rouge">NumericVector</code> (or with <code class="language-plaintext highlighter-rouge">as<NumericVector>(v[u])</code>).</p>
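<p>The following sketch puts <code class="language-plaintext highlighter-rouge">clone()</code>, an iterator loop and logical subsetting together. Both function names are hypothetical and chosen only for this example.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <Rcpp.h>
using namespace Rcpp;

// Replace negative entries by 0 without touching the caller's vector.
// [[Rcpp::export]]
NumericVector zero_negatives(NumericVector v) {
  NumericVector out = clone(v);    // deep copy; without clone() the R object passed in would be modified
  for (NumericVector::iterator it = out.begin(); it != out.end(); ++it) {
    if (*it < 0) *it = 0;
  }
  return out;
}

// Keep only the strictly positive entries.
// [[Rcpp::export]]
NumericVector keep_positive(NumericVector v) {
  LogicalVector u = v > 0.0;       // Rcpp sugar: element-wise comparison
  NumericVector out = v[u];        // the assignment turns the subset proxy into a vector
  return out;
}
</code></pre></div></div>
<p>After <code class="language-plaintext highlighter-rouge">sourceCpp()</code>, calling <code class="language-plaintext highlighter-rouge">zero_negatives(c(-1, 2, -3))</code> in R should give <code class="language-plaintext highlighter-rouge">0 2 0</code> and <code class="language-plaintext highlighter-rouge">keep_positive(c(-1, 2, -3))</code> should give <code class="language-plaintext highlighter-rouge">2</code>.</p>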
<h3 id="numericmatrix">NumericMatrix</h3>
<ul>
<li>Basic</li>
</ul>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">NumericMatrix</span> <span class="n">M</span><span class="p">;</span> <span class="c1">// multiple initiation methods as NumericVector</span>
<span class="n">M</span><span class="p">.</span><span class="n">length</span><span class="p">();</span> <span class="c1">// total elements of M</span>
<span class="n">M</span><span class="p">.</span><span class="n">nrow</span><span class="p">(),</span> <span class="n">M</span><span class="p">.</span><span class="n">ncol</span><span class="p">()</span> <span class="c1">// returns row , column number</span>
<span class="n">M</span><span class="p">.</span><span class="n">row</span><span class="p">(</span><span class="n">i</span><span class="p">),</span><span class="n">M</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="n">j</span><span class="p">)</span> <span class="c1">// return views (references) of row i, column j; writing to them changes M</span>
<span class="n">NumericVector</span> <span class="n">y</span><span class="o">=</span> <span class="n">M</span><span class="p">(</span> <span class="n">_</span> <span class="p">,</span> <span class="n">i</span><span class="p">);</span> <span class="c1">// get ith column (0-based); M(_,i) itself is also such a view</span>
</code></pre></div></div>
<p>More operations of Matrix in <em><strong>RcppArmadillo</strong></em> Section.</p>
<h3 id="dataframe">DataFrame</h3>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">DataFrame</span> <span class="n">df</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="o">::</span><span class="n">create</span><span class="p">(</span><span class="n">Named</span><span class="p">(</span><span class="s">"a1"</span><span class="p">)</span><span class="o">=</span><span class="n">v1</span><span class="p">,</span> <span class="n">_</span><span class="p">[</span><span class="s">"a2"</span><span class="p">]</span> <span class="o">=</span><span class="n">v2</span><span class="p">);</span> <span class="c1">// OK to do without names</span>
</code></pre></div></div>
<h3 id="list">List</h3>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// names can be added as well following same routine as in DataFrame</span>
<span class="n">List</span> <span class="n">L</span> <span class="o">=</span> <span class="n">List</span><span class="o">::</span><span class="n">create</span> <span class="p">(</span><span class="n">v1</span><span class="p">,</span><span class="n">v2</span><span class="p">);</span>
<span class="c1">// access elements by name; Mylist stands for any List whose elements are named</span>
<span class="kt">int</span> <span class="n">K</span> <span class="o">=</span> <span class="n">Mylist</span><span class="p">[</span><span class="s">"var_name"</span><span class="p">];</span>
</code></pre></div></div>
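<p>As a rough illustration of building a named <code class="language-plaintext highlighter-rouge">List</code> and reading an element back by name, consider the sketch below. The function names and the element names <code class="language-plaintext highlighter-rouge">"n"</code> and <code class="language-plaintext highlighter-rouge">"total"</code> are assumptions made for this example.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <Rcpp.h>
using namespace Rcpp;

// Return a named List, using both naming styles shown for DataFrame.
// [[Rcpp::export]]
List summarize_cpp(NumericVector x) {
  int n = x.size();
  double total = sum(x);           // Rcpp sugar sum
  return List::create(Named("n") = n, _["total"] = total);
}

// Read one element back by name; the proxy converts to the declared type.
// [[Rcpp::export]]
double total_from(List L) {
  double total = L["total"];
  return total;
}
</code></pre></div></div>
<p>In R, <code class="language-plaintext highlighter-rouge">total_from(summarize_cpp(c(1, 2, 3)))</code> should return 6.</p>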
<h2 id="use-r-functions">Use R Functions</h2>
<p>Example:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//example 1: use R function</span>
<span class="n">Function</span> <span class="nf">dnorm</span><span class="p">(</span><span class="s">"dnorm"</span><span class="p">);</span>
<span class="kt">double</span> <span class="n">temp</span> <span class="o">=</span> <span class="n">as</span><span class="o"><</span><span class="kt">double</span><span class="o">></span><span class="p">(</span><span class="n">dnorm</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="n">Named</span><span class="p">(</span><span class="s">"mean"</span><span class="p">,</span><span class="mi">0</span><span class="p">),</span><span class="n">Named</span><span class="p">(</span><span class="s">"sd"</span><span class="p">,</span><span class="mi">1</span><span class="p">),</span><span class="n">Named</span><span class="p">(</span><span class="s">"log"</span><span class="p">,</span><span class="mi">1</span><span class="p">)));</span> <span class="c1">// x is a single number; a Function call returns a SEXP, so convert explicitly</span>
<span class="c1">//example 2: use function from global environment</span>
<span class="n">Environment</span> <span class="n">env</span><span class="o">=</span><span class="n">Environment</span><span class="o">::</span><span class="n">global_env</span><span class="p">();</span>
<span class="n">Function</span> <span class="n">my_fun</span> <span class="o">=</span> <span class="n">env</span><span class="p">[</span><span class="s">"fun_in_glob"</span><span class="p">];</span> <span class="c1">// fun_in_glob() is a function defined in the global environment</span>
</code></pre></div></div>
<p>Calling back into R from C++ is slow, because every call goes through the R interpreter. Whenever possible, use a function provided by Rcpp or write your own C++ function rather than calling functions from R packages.</p>
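<p>A small sketch of the two options side by side: calling R’s <code class="language-plaintext highlighter-rouge">dnorm</code> through <code class="language-plaintext highlighter-rouge">Function</code>, and using the vectorized <code class="language-plaintext highlighter-rouge">Rcpp::dnorm</code> from Rcpp sugar, which stays in C++. The function names <code class="language-plaintext highlighter-rouge">logdens_via_R</code> and <code class="language-plaintext highlighter-rouge">logdens_sugar</code> are hypothetical.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <Rcpp.h>
using namespace Rcpp;

// Standard normal log-density by calling back into R.
// [[Rcpp::export]]
NumericVector logdens_via_R(NumericVector x) {
  Function dnorm_r("dnorm");       // looks up R's dnorm by name
  NumericVector out = dnorm_r(x, Named("mean", 0.0), Named("sd", 1.0),
                              Named("log", true));
  return out;
}

// The same quantity computed with Rcpp sugar, without leaving C++.
// [[Rcpp::export]]
NumericVector logdens_sugar(NumericVector x) {
  return Rcpp::dnorm(x, 0.0, 1.0, true);
}
</code></pre></div></div>
<p>Both functions should agree with <code class="language-plaintext highlighter-rouge">dnorm(x, log = TRUE)</code> in R; timing them on a long vector (for instance with <code class="language-plaintext highlighter-rouge">system.time()</code>) is a simple way to see the cost of the round trip to R.</p>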
<h2 id="linear-algebra-rcpparmadillo">Linear Algebra: <strong>RcppArmadillo</strong></h2>
<ul>
<li>
<p>Possible problems during installation and compilation: <a href="http://thecoatlessprofessor.com/programming/rcpp-rcpparmadillo-and-os-x-mavericks-lgfortran-and-lquadmath-error">-lgfortran and -lquadmath problem</a></p>
</li>
<li>When writing <em>RcppArmadillo</em> source files, use <code class="language-plaintext highlighter-rouge">#include <RcppArmadillo.h></code>; <code class="language-plaintext highlighter-rouge"><Rcpp.h></code> is then included automatically and should not be included separately.</li>
<li>
<p>Include <code class="language-plaintext highlighter-rouge">using namespace arma;</code> to save the trouble of writing <code class="language-plaintext highlighter-rouge">arma::</code> every time.</p>
</li>
<li>Basic variable types: <strong>arma::mat, arma::vec</strong></li>
</ul>
<h3 id="armamat">arma::mat</h3>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// initialization</span>
<span class="n">arma</span><span class="o">::</span><span class="n">mat</span> <span class="n">M</span><span class="p">;</span> <span class="c1">// initializes a 0 size matrix</span>
<span class="n">arma</span><span class="o">::</span><span class="n">mat</span> <span class="nf">M</span><span class="p">(</span><span class="n">a</span><span class="p">,</span><span class="n">b</span><span class="p">);</span> <span class="c1">// a by b matrix; zero-filled in recent Armadillo versions, uninitialized in older ones (call M.zeros() to be safe)</span>
<span class="c1">//</span>
<span class="c1">//member functions</span>
<span class="n">M</span><span class="p">.</span><span class="n">n_rows</span><span class="p">,</span> <span class="n">M</span><span class="p">.</span><span class="n">n_cols</span> <span class="c1">//number of rows and columns</span>
<span class="n">M</span><span class="p">.</span><span class="n">size</span><span class="p">()</span> <span class="c1">// returns number of elements</span>
<span class="n">M</span><span class="p">.</span><span class="n">print</span><span class="p">()</span> <span class="c1">//print the matrix </span>
<span class="n">M</span><span class="p">.</span><span class="n">reshape</span><span class="p">(),</span> <span class="n">M</span><span class="p">.</span><span class="n">fill</span><span class="p">(),</span> <span class="n">M</span><span class="p">.</span><span class="n">ones</span><span class="p">(),</span> <span class="n">M</span><span class="p">.</span><span class="n">zeros</span><span class="p">()</span> <span class="c1">//</span>
<span class="n">M</span><span class="p">.</span><span class="n">t</span><span class="p">()</span><span class="c1">// transpose</span>
<span class="n">M</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">),</span> <span class="n">M</span><span class="p">.</span><span class="n">row</span><span class="p">(</span><span class="n">i</span><span class="p">),</span> <span class="n">M</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="n">j</span><span class="p">),</span> <span class="n">M</span><span class="p">.</span><span class="n">rows</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">)</span> <span class="c1">// accessing elements, one row, one column, a range of rows (0-based)</span>
<span class="c1">//</span>
<span class="c1">//operators for M</span>
<span class="n">M</span> <span class="o">%</span> <span class="n">M</span><span class="p">,</span> <span class="n">M</span> <span class="o">/</span> <span class="n">M</span> <span class="c1">// element wise multiplication, division</span>
<span class="n">inv</span><span class="p">(</span><span class="n">M</span><span class="p">)</span> <span class="c1">// inverse</span>
<span class="n">M</span><span class="o">*</span><span class="n">M</span> <span class="c1">//matrix product;</span>
<span class="c1">//</span>
<span class="c1">// Matrix subsetting</span>
<span class="n">arma</span><span class="o">::</span><span class="n">mat</span> <span class="n">M2</span> <span class="o">=</span> <span class="n">M</span><span class="p">.</span><span class="n">rows</span><span class="p">(</span><span class="n">from</span><span class="p">,</span> <span class="n">to</span><span class="p">);</span> <span class="c1">// contiguous; use M.cols() for column subsetting</span>
<span class="n">arma</span><span class="o">::</span><span class="n">mat</span> <span class="n">M3</span><span class="o">=</span> <span class="n">M</span><span class="p">.</span><span class="n">submat</span><span class="p">(</span><span class="n">row_from</span><span class="p">,</span> <span class="n">col_from</span><span class="p">,</span> <span class="n">row_to</span><span class="p">,</span> <span class="n">col_to</span><span class="p">);</span> <span class="c1">// contiguous ; by both row and column</span>
<span class="c1">// non-contiguous</span>
<span class="c1">// access multiple rows by indices</span>
<span class="c1">// index_vec needs to be of type uvec (Col<uword>) or urowvec (Row<uword>)</span>
<span class="n">M</span><span class="p">.</span><span class="n">cols</span><span class="p">(</span><span class="n">index_vec</span><span class="p">),</span> <span class="n">M</span><span class="p">.</span><span class="n">rows</span><span class="p">(</span><span class="n">index_vec</span><span class="p">)</span>
</code></pre></div></div>
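<p>Here is a minimal sketch of a complete <em>RcppArmadillo</em> source file using a few of the operations above (<code class="language-plaintext highlighter-rouge">.t()</code>, matrix product, contiguous row subsetting). The function name <code class="language-plaintext highlighter-rouge">ols_coef</code> is hypothetical; the <code class="language-plaintext highlighter-rouge">Rcpp::depends</code> attribute is what lets <code class="language-plaintext highlighter-rouge">sourceCpp()</code> find the Armadillo headers.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// Least-squares coefficients from the first n_use rows of X and y.
// Assumes 1 <= n_use <= X.n_rows and that X has full column rank.
// [[Rcpp::export]]
arma::vec ols_coef(const arma::mat& X, const arma::vec& y, int n_use) {
  arma::mat Xs = X.rows(0, n_use - 1);       // contiguous row subset
  arma::vec ys = y.subvec(0, n_use - 1);
  // solve (Xs' Xs) b = Xs' ys instead of forming inv(Xs' Xs) explicitly
  return arma::solve(Xs.t() * Xs, Xs.t() * ys);
}
</code></pre></div></div>
<p>In R, <code class="language-plaintext highlighter-rouge">ols_coef(cbind(1, 1:3), c(2, 4, 6), 3)</code> should return coefficients close to 0 and 2, since the data follow y = 2x exactly. Using <code class="language-plaintext highlighter-rouge">solve()</code> rather than <code class="language-plaintext highlighter-rouge">inv()</code> is a design choice here: it is usually faster and numerically more stable.</p>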
<h3 id="armavec">arma::vec</h3>
<p><strong>arma::vec</strong> is also treated as <strong>arma::mat</strong> with only one column.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="c1">// basics</span>
<span class="n">arma</span><span class="o">::</span><span class="n">vec</span> <span class="n">V</span><span class="p">;</span>
<span class="n">V</span><span class="p">.</span><span class="n">size</span><span class="p">();</span> <span class="c1">// returns length of V</span>
<span class="c1">//</span>
<span class="c1">// vector subsetting</span>
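<span class="n">V</span><span class="p">.</span><span class="n">subvec</span><span class="p">(</span> <span class="n">from</span><span class="p">,</span> <span class="n">to</span><span class="p">);</span> <span class="c1">// contiguous subsetting; from, to are indices</span>
<span class="n">V</span><span class="p">.</span><span class="n">elem</span><span class="p">(</span><span class="n">index_vec</span><span class="p">);</span> <span class="c1">// non-contiguous subsetting; index_vec is a uvec of element indices</span>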
</code></pre></div></div>
<h3 id="cube">Cube</h3>
<p>A cube is a three-dimensional array. It is used less often than <strong>arma::mat, arma::vec</strong>, but is also useful.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//constructors</span>
<span class="n">arma</span><span class="o">::</span><span class="n">cube</span> <span class="nf">x</span><span class="p">(</span><span class="n">n_row</span><span class="p">,</span> <span class="n">n_col</span><span class="p">,</span> <span class="n">n_slice</span><span class="p">);</span> <span class="c1">// n_row x n_col x n_slice cube; call x.zeros() if you need guaranteed zeros</span>
<span class="c1">//</span>
<span class="c1">// attributes</span>
<span class="n">x</span><span class="p">.</span><span class="n">n_cols</span><span class="p">,</span> <span class="n">x</span><span class="p">.</span><span class="n">n_rows</span><span class="p">,</span> <span class="n">x</span><span class="p">.</span><span class="n">n_slices</span> <span class="c1">// size along each dimension</span>
<span class="n">x</span><span class="p">.</span><span class="n">size</span><span class="p">()</span> <span class="c1">// number of elements</span>
<span class="c1">//</span>
<span class="c1">// member</span>
<span class="n">x</span><span class="p">.</span><span class="n">slice</span><span class="p">(</span><span class="n">i</span><span class="p">);</span> <span class="c1">// mat of slice i</span>
<span class="n">x</span><span class="p">.</span><span class="n">slices</span><span class="p">(</span><span class="n">first_slice</span><span class="p">,</span> <span class="n">last_slice</span><span class="p">);</span> <span class="c1">// contiguous slices</span>
<span class="n">x</span><span class="p">.</span><span class="n">subcube</span><span class="p">(</span><span class="n">row1</span><span class="p">,</span><span class="n">col1</span><span class="p">,</span><span class="n">slice1</span><span class="p">,</span><span class="n">row2</span><span class="p">,</span><span class="n">col2</span><span class="p">,</span><span class="n">slice2</span><span class="p">);</span> <span class="c1">// contiguous subcube</span>
<span class="n">x</span><span class="p">.</span><span class="n">fill</span><span class="p">(</span><span class="n">c</span><span class="p">);</span> <span class="c1">// fill the cube with the value c (a double)</span>
</code></pre></div></div>
<h3 id="shared-functions"><em>shared functions</em></h3>
<p>In this section, I collect some useful functions, most of them shared by both <em>arma::mat</em> and <em>arma::vec</em>, and some also by <em>arma::cube</em>.</p>
<ul>
<li><strong>Element-wise functions</strong>: <a href="http://arma.sourceforge.net/docs.html#misc_fns">element-wise</a>.</li>
<li><strong>Constructors</strong>: <a href="http://arma.sourceforge.net/docs.html#constructors_mat">mat constructor</a> and <a href="http://arma.sourceforge.net/docs.html#adv_constructors_mat">mat advanced constructor</a>.</li>
<li>others:</li>
</ul>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="c1">// iterators</span>
<span class="n">arma</span><span class="o">::</span><span class="n">vec</span><span class="o">::</span><span class="n">iterator</span> <span class="n">it</span><span class="p">;</span> <span class="c1">// arma::vec::const_iterator for read only</span>
<span class="n">v</span><span class="p">.</span><span class="n">begin</span><span class="p">(),</span> <span class="n">v</span><span class="p">.</span><span class="n">end</span><span class="p">()</span> <span class="c1">// for vector</span>
<span class="n">M</span><span class="p">.</span><span class="n">begin_row</span><span class="p">(</span><span class="n">row_number</span><span class="p">),</span> <span class="n">M</span><span class="p">.</span><span class="n">end_row</span><span class="p">(</span><span class="n">row_number</span><span class="p">)</span> <span class="c1">// for mat; column version similar </span>
<span class="c1">//</span>
<span class="n">diagmat</span><span class="p">(</span> <span class="n">M</span> <span class="p">)</span> <span class="c1">// generate diagonal matrix from given matrix or vector</span>
<span class="n">accu</span><span class="p">(</span><span class="n">M</span><span class="p">)</span> <span class="c1">// accumulate sum of all elements in vector or matrix</span>
<span class="c1">//</span>
<span class="c1">//elements access</span>
<span class="n">V</span><span class="p">.</span><span class="n">at</span><span class="p">(</span><span class="n">i</span><span class="p">),</span> <span class="n">V</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="c1">// element i, for vector</span>
<span class="n">M</span><span class="p">.</span><span class="n">at</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">),</span> <span class="n">M</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">)</span> <span class="c1">// for matrix</span>
<span class="c1">//</span>
<span class="c1">// initialization</span>
<span class="n">ones</span><span class="p">(</span><span class="n">n_elem</span><span class="p">),</span> <span class="n">ones</span><span class="p">(</span><span class="n">n_rows</span><span class="p">,</span><span class="n">n_cols</span><span class="p">)</span> <span class="c1">// matrix filled with 1</span>
<span class="n">ones</span><span class="o"><</span><span class="n">vec_type</span><span class="o">></span><span class="p">(</span><span class="n">n_elem</span><span class="p">);</span> <span class="n">ones</span><span class="o"><</span><span class="n">mat_type</span><span class="o">></span><span class="p">(</span><span class="n">dim1</span><span class="p">,</span> <span class="n">dim2</span><span class="p">)</span>
<span class="n">randu</span><span class="o"><</span><span class="n">type</span><span class="o">></span><span class="p">(</span><span class="n">dim1</span><span class="p">,</span> <span class="n">dim2</span><span class="p">,</span> <span class="n">dim3</span><span class="p">);</span> <span class="c1">//unif(0,1); type can be : vec, mat, cube</span>
<span class="n">randn</span><span class="o"><</span><span class="n">type</span><span class="o">></span><span class="p">(</span><span class="n">dim1</span><span class="p">,</span> <span class="n">dim2</span><span class="p">,</span> <span class="n">dim3</span><span class="p">);</span> <span class="c1">// N(0,1)</span>
<span class="n">zeros</span><span class="o"><</span><span class="n">vector_type</span><span class="o">/</span><span class="n">mat_type</span><span class="o">/</span><span class="n">cube_type</span><span class="o">></span><span class="p">(...);</span> <span class="c1">// initialization with 0s</span>
<span class="c1">//others</span>
<span class="p">.</span><span class="n">min</span><span class="p">();.</span><span class="n">max</span><span class="p">();</span> <span class="c1">// get minimum maximum</span>
</code></pre></div></div>
<ul>
<li>
<p>Type conversion: <br />
say you have an input of type <em>NumericMatrix x</em>, you can convert it with: <br />
<code class="language-plaintext highlighter-rouge">arma::mat y= as<arma::mat>(x);</code></p>
<p>To work in the opposite direction use <em>wrap</em> function :<br />
<code class="language-plaintext highlighter-rouge">NumericMatrix x= wrap(y);</code></p>
</li>
</ul>
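<p>The round trip between Rcpp and Armadillo types can be sketched as follows. The function name <code class="language-plaintext highlighter-rouge">gram_cpp</code> is made up for this example; it converts an R matrix to <code class="language-plaintext highlighter-rouge">arma::mat</code>, computes with it, and wraps the result back.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
using namespace Rcpp;

// Gram matrix t(x) %*% x, computed with Armadillo.
// [[Rcpp::export]]
NumericMatrix gram_cpp(NumericMatrix x) {
  arma::mat y = as<arma::mat>(x);  // Rcpp -> Armadillo (copies the data)
  arma::mat g = y.t() * y;         // matrix product
  return wrap(g);                  // Armadillo -> Rcpp
}
</code></pre></div></div>
<p>In R, <code class="language-plaintext highlighter-rouge">gram_cpp(m)</code> should match <code class="language-plaintext highlighter-rouge">crossprod(m)</code> for a numeric matrix <code class="language-plaintext highlighter-rouge">m</code>. An exported function may also take <code class="language-plaintext highlighter-rouge">arma::mat</code> arguments and return <code class="language-plaintext highlighter-rouge">arma::mat</code> directly, in which case the same conversions happen implicitly.</p>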
<h3 id="useful-topics">useful topics</h3>
<ul>
<li>use logical vector to access submatrix/subvector:</li>
</ul>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">arma</span><span class="o">::</span><span class="n">mat</span> <span class="nf">matrix_sub</span><span class="p">(</span><span class="n">arma</span><span class="o">::</span><span class="n">mat</span> <span class="n">M</span><span class="p">,</span> <span class="n">LogicalVector</span> <span class="n">a</span><span class="p">,</span> <span class="kt">int</span> <span class="n">b</span><span class="p">)</span>
<span class="p">{</span>
<span class="c1">// b=1: select row</span>
<span class="c1">// b=2: select column</span>
<span class="n">arma</span><span class="o">::</span><span class="n">mat</span> <span class="n">out</span><span class="p">;</span>
<span class="k">if</span><span class="p">(</span><span class="n">b</span><span class="o">==</span><span class="mi">2</span><span class="p">){</span>
<span class="n">arma</span><span class="o">::</span><span class="n">colvec</span> <span class="n">z</span><span class="o">=</span><span class="n">as</span><span class="o"><</span><span class="n">arma</span><span class="o">::</span><span class="n">colvec</span><span class="o">></span><span class="p">(</span><span class="n">a</span><span class="p">);</span>
<span class="n">out</span><span class="o">=</span><span class="n">M</span><span class="p">.</span><span class="n">cols</span><span class="p">(</span><span class="n">find</span><span class="p">(</span><span class="n">z</span><span class="o">==</span><span class="mi">1</span><span class="p">));</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span><span class="p">(</span><span class="n">b</span><span class="o">==</span><span class="mi">1</span><span class="p">){</span>
<span class="n">arma</span><span class="o">::</span><span class="n">rowvec</span> <span class="n">z</span><span class="o">=</span><span class="n">as</span><span class="o"><</span><span class="n">arma</span><span class="o">::</span><span class="n">rowvec</span><span class="o">></span><span class="p">(</span><span class="n">a</span><span class="p">);</span>
<span class="n">out</span><span class="o">=</span><span class="n">M</span><span class="p">.</span><span class="n">rows</span><span class="p">(</span><span class="n">find</span><span class="p">(</span><span class="n">z</span><span class="o">==</span><span class="mi">1</span><span class="p">));</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">out</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>We first convert the logical vector <code class="language-plaintext highlighter-rouge">a</code> into a <code class="language-plaintext highlighter-rouge">colvec</code> or <code class="language-plaintext highlighter-rouge">rowvec</code>, on which we can use the <code class="language-plaintext highlighter-rouge">find(expr)</code> function. <code class="language-plaintext highlighter-rouge">find</code> returns the indices (type <code class="language-plaintext highlighter-rouge">uvec</code>) of the elements where <code class="language-plaintext highlighter-rouge">expr</code> is true, and those indices can be used to extract the submatrix.</p>
<p>For a vector, the steps are simpler:</p>
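<p>The "find the true positions, then index with them" idea can be sketched without Armadillo. The block below is only a standard-library analogy: <code class="language-plaintext highlighter-rouge">find_true</code>, <code class="language-plaintext highlighter-rouge">select_rows</code> and the <code class="language-plaintext highlighter-rouge">Matrix</code> alias are hypothetical names made up for illustration, not part of Rcpp or Armadillo.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major stand-in for arma::mat, for illustration only.
using Matrix = std::vector<std::vector<double>>;

// Collect the positions at which the mask is true
// (similar in spirit to calling find() on a 0/1 vector).
std::vector<std::size_t> find_true(const std::vector<bool>& mask) {
    std::vector<std::size_t> idx;
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (mask[i]) idx.push_back(i);
    }
    return idx;
}

// Keep only the rows listed in idx
// (similar in spirit to M.rows(idx)).
Matrix select_rows(const Matrix& M, const std::vector<std::size_t>& idx) {
    Matrix out;
    out.reserve(idx.size());
    for (std::size_t i : idx) {
        assert(i < M.size());  // every index must refer to an existing row
        out.push_back(M[i]);
    }
    return out;
}
```

<p>With a mask of <code class="language-plaintext highlighter-rouge">{true, false, true}</code>, <code class="language-plaintext highlighter-rouge">find_true</code> yields the indices 0 and 2, and <code class="language-plaintext highlighter-rouge">select_rows</code> returns the first and third rows.</p>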
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// convert logical vector to uvec</span>
<span class="n">arma</span><span class="o">::</span><span class="n">uvec</span> <span class="n">q</span> <span class="o">=</span> <span class="n">as</span><span class="o"><</span><span class="n">arma</span><span class="o">::</span><span class="n">uvec</span><span class="o">></span><span class="p">(</span><span class="n">a</span><span class="p">);</span>
<span class="c1">// use .elem() function to get subvector</span>
<span class="k">return</span> <span class="n">v</span><span class="p">.</span><span class="n">elem</span><span class="p">(</span><span class="n">find</span><span class="p">(</span><span class="n">q</span><span class="p">));</span>
</code></pre></div></div>
<p>More on how to use find: <a href="http://arma.sourceforge.net/docs.html#find">find</a>.</p>
<h2 id="work-with-distributions">Work with Distributions</h2>
<p><em>Rcpp</em> provides equivalents for many R functions related to distributions, so you don’t have to scratch your head writing your own, or call back into the R functions at the cost of a slowdown.</p>
<h5 id="uniform-distribution">Uniform distribution</h5>
<ul>
<li><code class="language-plaintext highlighter-rouge">R::runif(double a, double b)</code> : uniform from <code class="language-plaintext highlighter-rouge">[a,b]</code></li>
</ul>
<h5 id="binomial-distribution">Binomial distribution</h5>
<ul>
<li><code class="language-plaintext highlighter-rouge">R::dbinom(x, size, prob, log=0/1)</code>: expects 4 inputs<br />
<code class="language-plaintext highlighter-rouge">R::qbinom(p, size, prob, lower.tail, log.p)</code>: expects 5 inputs<br />
<code class="language-plaintext highlighter-rouge">R::rbinom(size, p)</code>: generates only one random value per call; loop over it if you need a vector<br />
The parameters are the same as in <em>R</em>. For logical parameters such as <em>log</em>, use <code class="language-plaintext highlighter-rouge">0/1</code> instead of <code class="language-plaintext highlighter-rouge">true/false</code>.</li>
</ul>
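<p>Vectorizing a one-value-at-a-time generator is just a loop that fills a container. The sketch below shows the pattern with the standard <code class="language-plaintext highlighter-rouge">&lt;random&gt;</code> header instead of <code class="language-plaintext highlighter-rouge">R::rbinom</code>, so that it compiles outside of R; <code class="language-plaintext highlighter-rouge">draw_binomials</code> is a hypothetical helper name, and inside an Rcpp function the loop body would call the scalar generator instead.</p>

```cpp
#include <cassert>
#include <random>
#include <vector>

// Fill a vector with n binomial draws by calling a
// one-draw-at-a-time generator inside a loop.
std::vector<int> draw_binomials(std::mt19937& gen, int n, int size, double prob) {
    assert(n >= 0 && size >= 0 && prob >= 0.0 && prob <= 1.0);
    std::binomial_distribution<int> one_draw(size, prob);
    std::vector<int> out(n);
    for (int i = 0; i < n; ++i) {
        out[i] = one_draw(gen);  // one scalar draw per iteration
    }
    return out;
}
```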
<h5 id="poisson-distribution">Poisson distribution</h5>
<h5 id="beta-distribution">Beta distribution</h5>
<ul>
<li><code class="language-plaintext highlighter-rouge">R::dbeta(double x, double a, double b, int log)</code></li>
</ul>
<h5 id="gamma-distribution">Gamma distribution</h5>
<ul>
<li><code class="language-plaintext highlighter-rouge">R::rgamma(double shape, double scale)</code> : it takes a scale rather than a rate as input (scale = 1/rate). There is also a vectorized version:
<code class="language-plaintext highlighter-rouge">Rcpp::rgamma(int n, double shape, double scale)</code><br />
<code class="language-plaintext highlighter-rouge">R::dgamma(double x, double shape, double scale, int logical)</code> : the 4th parameter controls whether the output is <em>log</em> transformed.</li>
</ul>
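<p>If your model is written in terms of a rate, convert it before calling a scale-based generator. Here is a minimal sketch of that conversion, using <code class="language-plaintext highlighter-rouge">std::gamma_distribution</code> from the standard library (which is also parameterised by shape and scale) so that it runs without R. The helper names <code class="language-plaintext highlighter-rouge">rate_to_scale</code> and <code class="language-plaintext highlighter-rouge">draw_gamma_with_rate</code> are hypothetical.</p>

```cpp
#include <cassert>
#include <random>

// Convert a rate parameter into the equivalent scale parameter.
double rate_to_scale(double rate) {
    assert(rate > 0.0);
    return 1.0 / rate;
}

// Draw one gamma variate from a (shape, rate) specification.
// The generator itself expects (shape, scale), so the rate is converted first.
double draw_gamma_with_rate(std::mt19937& gen, double shape, double rate) {
    std::gamma_distribution<double> dist(shape, rate_to_scale(rate));
    return dist(gen);
}
```

<p>The mean of a gamma variable is shape × scale, so with shape 2 and rate 4 the draws should average close to 0.5; that is a quick sanity check that the parameterisation is the intended one.</p>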
<h5 id="exponential-distribution">Exponential distribution</h5>
<ul>
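<li><code class="language-plaintext highlighter-rouge">R::rexp( double r )</code> : generates one exponential random variable; here the argument is the scale (the mean, i.e. 1/rate)<br />
<code class="language-plaintext highlighter-rouge">Rcpp::rexp( int n, double r)</code> : generates a vector of <code class="language-plaintext highlighter-rouge">n</code> exponential random variables; here the argument is the rate, as in <em>R</em></li>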
</ul>
<h2 id="frequently-used-functions">Frequently used functions</h2>
<ul>
<li>
<p>Type conversion:<br />
<code class="language-plaintext highlighter-rouge">wrap()</code> : a templated function that transforms an arbitrary object into a <em>SEXP</em>, that can be returned to R. <br />
eg:<br />
<code class="language-plaintext highlighter-rouge">NumericVector x= wrap(seq(1,n))</code></p>
</li>
<li>
<p>Console output:</p>
</li>
</ul>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Console output:</span>
<span class="n">Rcout</span> <span class="o"><<</span> <span class="s">"Some message"</span> <span class="o"><<</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
<span class="n">Rcerr</span> <span class="o"><<</span> <span class="s">"Error message"</span> <span class="p">;</span>
</code></pre></div></div>
<ul>
<li>R <em>any()</em> equivalent:</li>
</ul>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">bool</span> <span class="nf">any_cpp</span><span class="p">(</span><span class="n">LogicalVector</span> <span class="n">lv</span><span class="p">)</span>
<span class="p">{</span><span class="k">return</span> <span class="n">is_true</span><span class="p">(</span><span class="n">any</span><span class="p">(</span><span class="n">lv</span><span class="p">));}</span>
</code></pre></div></div>
<ul>
<li>
<p>R <em>seq()</em> equivalent:<br />
<code class="language-plaintext highlighter-rouge">seq(int start,int end)</code>, which is the same as R <code class="language-plaintext highlighter-rouge">seq( , ,by=1)</code>. The return type is <em>Rcpp::Range</em>, so use the <code class="language-plaintext highlighter-rouge">wrap()</code> function to make it a NumericVector.</p>
</li>
<li>
<p>R <em>sample()</em>:<br />
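For simple cases, we can adapt <code class="language-plaintext highlighter-rouge">R::runif()</code> to achieve our goal. <br />
For example, to sample one integer from <code class="language-plaintext highlighter-rouge">c(a:b)</code>, we can do <code class="language-plaintext highlighter-rouge">int out=R::runif(a,b+1)</code>: the conversion to <code class="language-plaintext highlighter-rouge">int</code> drops the fractional part, so for non-negative <code class="language-plaintext highlighter-rouge">a</code> each integer from <code class="language-plaintext highlighter-rouge">a</code> to <code class="language-plaintext highlighter-rouge">b</code> is equally likely.</p>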
</li>
</ul>
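<p>The uniform-draw trick can be sketched as follows. This version uses the standard <code class="language-plaintext highlighter-rouge">&lt;random&gt;</code> header rather than <code class="language-plaintext highlighter-rouge">R::runif</code> so that it runs outside of R, and it uses <code class="language-plaintext highlighter-rouge">std::floor</code> instead of a plain cast so that negative lower bounds are also handled. <code class="language-plaintext highlighter-rouge">sample_int</code> is a hypothetical helper name.</p>

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Draw one integer uniformly from {a, ..., b} by flooring
// a continuous uniform draw on [a, b + 1).
int sample_int(std::mt19937& gen, int a, int b) {
    assert(a <= b);
    std::uniform_real_distribution<double> unif(a, b + 1.0);
    int out = static_cast<int>(std::floor(unif(gen)));
    // Guard against a draw that rounds up to exactly b + 1.
    return out > b ? b : out;
}
```

<p>Flooring rounds toward negative infinity, whereas a cast to <code class="language-plaintext highlighter-rouge">int</code> rounds toward zero; the two agree only when the drawn value is non-negative.</p>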
<p>There is also an equivalent <code class="language-plaintext highlighter-rouge">sample</code> function in the <code class="language-plaintext highlighter-rouge"><sample.h></code> header. To use it, first <code class="language-plaintext highlighter-rouge">#include <RcppArmadilloExtensions/sample.h></code> and then follow the syntax: <br />
<code class="language-plaintext highlighter-rouge">Rcpp::RcppArmadillo::sample(sample_set,int size, bool replacement, weight_vec)</code>.</p>
<p>See <a href="http://stackoverflow.com/questions/26384959/rcpp-r-sample-equivalent-from-a-numericvector">stackoverflow:sample</a>.</p>
<ul>
<li><em>max()</em> equivalent:<br />
<code class="language-plaintext highlighter-rouge">max( obj )</code> : obj can be <em>NumericVector</em></li>
</ul>
<h2 id="others">Others</h2>
<ul>
<li>To use C++11 features, such as <em>range based for</em>, and more ways of variable initialization, include</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>//[[Rcpp::plugins("cpp11")]]
</code></pre></div></div>
<p>in <em>.cpp</em> source file comments.</p>
<ul>
<li><em>Rcpp</em> functions take their inputs from <em>R</em>, and <em>R</em> has no <em>pointer</em> type, so <em>Rcpp</em> functions should not take pointers as input arguments. <br />
If you don’t want the function to make a copy of your variable, you can declare the parameter as a reference, e.g. <code class="language-plaintext highlighter-rouge">void my_fun(int &var){}</code>.</li>
</ul>Li Zengli.zeng@yale.eduPackage Rcpp allows you to use C++ or C code in an R environment. It’s a great tool to improve the speed of your program, at the price of longer programming and harder debugging. But when it finally works out, it’s totally worth it.