Software Engineer

Find All Collinear Points - A Pattern Recognition Problem

Line Pattern Recognition. A basic but important application of pattern recognition is to recognize line patterns in a given set of points. http://coursera.cs.princeton.edu/algs4/assignments/collinear.html. This blog gives a brief introduction to the problem and provides an efficient solution. Code is available in algs4/collinear/src/. The problem can be described as: given a set of n distinct points in the plane, find every (maximal) line segment that connects a subset of 4 or more of the points. ...
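
As a small illustration of the building block behind this problem, the sketch below tests whether points lie on one line using a cross product, which avoids division and floating-point slopes. The class and method names are hypothetical and are not taken from the assignment's API, and the coordinates are assumed to be small enough that the products fit in a `long`. It is only the basic collinearity test, not the efficient solution the post refers to.

```java
// Hypothetical helper for illustration; not the assignment's Point or LineSegment API.
class Collinear {
    // Cross product of (q - p) and (r - p); it is zero exactly when p, q, r are collinear.
    // Assumes coordinates are small enough that the products do not overflow a long.
    static boolean collinear(long px, long py, long qx, long qy, long rx, long ry) {
        return (qx - px) * (ry - py) - (qy - py) * (rx - px) == 0;
    }

    // True if every point in pts (each given as {x, y}) lies on the line through the first two.
    static boolean allCollinear(long[][] pts) {
        for (int i = 2; i < pts.length; i++) {
            if (!collinear(pts[0][0], pts[0][1], pts[1][0], pts[1][1], pts[i][0], pts[i][1])) {
                return false;
            }
        }
        return true;
    }
}
```

For example, `Collinear.allCollinear` returns true for the four points (0,0), (1,2), (2,4), (3,6), which would form one segment of 4 collinear points.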

Randomized Queue with Reservoir Sampling

This blog explains an application of randomized queue algorithms. Permutation client memory challenge: a client program Permutation.java takes an integer k as a command-line argument, reads a sequence of strings from standard input using StdIn.readString(), and prints exactly k of them, uniformly at random. Each item from the sequence is printed at most once. More detail can be found in the programming assignment specification and checklist; code is available in algs4/queues/src/. Randomized queue: for a randomized queue, the item removed is chosen uniformly at random from the items in the data structure. ...
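
A minimal sketch of reservoir sampling, the technique named in the title, may help here. It keeps at most k items in memory while reading a stream of unknown length. The class name `ReservoirSample` and its `sample` method are hypothetical; this is not the post's Permutation.java and it uses `java.util` instead of the algs4 `StdIn` and `RandomizedQueue` classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration; not the assignment's Permutation client.
class ReservoirSample {
    // Returns k items chosen uniformly at random from the stream
    // (or all items if the stream holds fewer than k).
    static List<String> sample(Iterable<String> stream, int k, Random rng) {
        List<String> reservoir = new ArrayList<>();
        if (k <= 0) {
            return reservoir;
        }
        int seen = 0; // number of items read so far
        for (String item : stream) {
            seen++;
            if (reservoir.size() < k) {
                reservoir.add(item);       // fill the reservoir first
            } else {
                int j = rng.nextInt(seen); // uniform in [0, seen)
                if (j < k) {
                    reservoir.set(j, item); // keep the new item with probability k / seen
                }
            }
        }
        return reservoir;
    }
}
```

Each item ends up in the result with equal probability, and memory use is proportional to k, not to the length of the input. The order of the returned items is not itself uniformly random, so a shuffle would be needed if the output order matters.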

Percolations problem

Union-find applications: Percolation. Problem description: Percolation data type. To model a percolation system, create a data type Percolation with the following API: public class Percolation { public Percolation(int n); // create n-by-n grid, with all sites blocked public void open(int row, int col); // open site (row, col) if it is not open already public boolean isOpen(int row, int col); // is site (row, col) open? public boolean isFull(int row, int col); // is site (row, col) full? public int numberOfOpenSites(); // number of open sites public boolean percolates(); // does the system percolate? } Monte Carlo simulation. To estimate the percolation threshold, consider the following computational experiment: ...

Inf Course Note - Parallel Programming Language and Systems

Course notes from the School of Informatics, University of Edinburgh. Parallel Programming Language and Systems, Informatics, University of Edinburgh. Reference: http://www.inf.ed.ac.uk/teaching/courses/ppls/; CMU 15213: Introduction to Computer Systems (ICS); Computer Systems: A Programmer’s Perspective; A Comprehensive MPI Tutorial Resource; a chapter on MPI from Ian Foster’s online book Designing and Building Parallel Programs. Introduction to parallel computer architecture: covering some of the nasty issues presented by the shared memory model, including weak consistency models and false sharing in the cache, and some architectural issues for the multicomputer model. ...

Inf Course Note - Software Architecture, Process, and Management

Course notes from the School of Informatics, University of Edinburgh. Software Architecture, Process, and Management, Informatics, University of Edinburgh. Reference: Microsoft; IBM; Software Architecture in Practice (3rd edition), Bass, Clements, and Kazman. What is Software Architecture? Software architecture is often described as the organization or structure of a system, where the system represents a collection of components that accomplish a specific function or set of functions. It groups components into areas of concern (layers), for example the UI, business processing, and data access, and it focuses on the interaction between the components and how different components work together. The definition given in the book: ...

Inf Course Note - Software Testing

Course notes from the School of Informatics, University of Edinburgh. Software Testing, Informatics, University of Edinburgh. Reference: http://www.inf.ed.ac.uk/teaching/courses/st/2017-18/index.html; Pezze and Young, Software Testing and Analysis: Process, Principles and Techniques, Wiley, 2007. Why Software Testing? 1. Software Faults, Errors & Failures. The problem starts with faults. Fault (bug): a latent error, a mistake in programming, e.g. add(x, y) = x * y. With such a fault in a program, the fault is activated if and only if add(x, y) = x * y is executed, and it then generates an error. ...

Algorithms - Princeton

Algorithms, Part I, https://online.princeton.edu/course/algorithms-part-i Algorithms, Part II, https://online.princeton.edu/course/algorithms-part-ii Algorithms, 4th Edition by Robert Sedgewick and Kevin Wayne https://algs4.cs.princeton.edu/ Union-Find. Consider the dynamic connectivity problem: modeling multiple objects connected in a space or network. Applications involve manipulating objects of all types. ・Pixels in a digital photo. ・Computers in a network. ・Friends in a social network. ・Transistors in a computer chip. Given a set of N objects: union(a, b): connect two objects. connected(p, q): are two objects connected? find(p): find the component identifier for p (0 to N – 1). Modeling the objects: array. ...
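
The three operations listed above can be sketched with an array-based structure. The version below is a weighted quick-union with path halving, which is one of several union-find variants and is not necessarily the one the post goes on to present; the class name `WeightedQuickUnion` is chosen here for illustration.

```java
// Illustrative union-find variant: weighted quick-union with path halving.
class WeightedQuickUnion {
    private final int[] parent; // parent[i] = parent of i in its tree
    private final int[] size;   // size[r] = number of objects in the tree rooted at r

    WeightedQuickUnion(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i; // every object starts in its own component
            size[i] = 1;
        }
    }

    // Component identifier for p: the root of its tree, a value in 0..n-1.
    int find(int p) {
        while (p != parent[p]) {
            parent[p] = parent[parent[p]]; // path halving: point p at its grandparent
            p = parent[p];
        }
        return p;
    }

    // Are p and q in the same component?
    boolean connected(int p, int q) {
        return find(p) == find(q);
    }

    // Connect the components containing p and q, linking the smaller tree under the larger.
    void union(int p, int q) {
        int rootP = find(p);
        int rootQ = find(q);
        if (rootP == rootQ) {
            return;
        }
        if (size[rootP] < size[rootQ]) {
            parent[rootP] = rootQ;
            size[rootQ] += size[rootP];
        } else {
            parent[rootQ] = rootP;
            size[rootP] += size[rootQ];
        }
    }
}
```

Linking the smaller tree under the larger keeps the trees shallow, so `find` stays fast even after many `union` calls.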

Launching Sublime, Atom, and Other Editors Directly from Bash to Open or Create Files

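Programmers and other people who type a lot often use editors such as Sublime, Atom, and Typora. Having to click with the mouse every time you open a file in Sublime, or create a new file inside the editor, gets tedious. A single command solves this: configure Git Bash so that any text editor can open or create a file from the command line. Atom is used as the example here. Basic steps: open Git Bash and cd to your target folder, or right-click inside the target folder and open Git Bash there. Running atom xxx.md opens the Markdown file xxx.md in a new Atom window; if the file does not exist, it is created automatically. This works for other file types too, such as .java. For Sublime, use subl xxx.java; likewise, for Notepad++ use notepad++ xxx.java. (If an error appears, see below.) If the system does not recognize the command: users of Sublime or Notepad++ may see an error saying that the system cannot recognize the command. It can be fixed as follows. Method 1: create a file named subl (note: it must have no extension) with the content: #!/bin/sh "D:\Sublime Text 3\sublime_text.exe" $1 & The first line declares that this is a shell script. The string on the second line is the Sublime installation path; the example is the path on my machine, so change it to your own. The $1 on the second line is the argument typed after the command, and the & on the second line runs the command in the background, so that Sublime does not block Git Bash once it opens. Save the file to the C:\Program Files (x86)\Git\mingW32\bin directory (your Git directory may differ from mine; change it to your own). The same approach works for other programs, for example opening .html files with Chrome. If you do not want to create a new file each time, use Method 2 below. Method 2: go to the C:\Users\<your user name> directory; if your user name is Administrator, that is C:\Users\Administrator. This directory usually holds Windows folders such as My Documents and Desktop. In that directory, run notepad .bashrc in Git Bash, which creates a file named .bashrc and opens it in Windows Notepad; this file has no base name, only an extension. In .bashrc you can define command aliases for Git Bash, set paths, and so on. Add the following line to .bashrc: alias notepad++="/D/Notepad++/notepad++.exe", changing it to the installation path on your machine. alias means an alternative name: when we run notepad++, what actually runs is the command after the =. The setting takes effect when Git Bash is reopened; if you do not want to close and reopen it, run source ~/.bashrc in Bash to load the modified settings immediately. Now typing notepad++ test.py in Bash opens Notepad++ and creates a Python file named test.py. The alias does not have to be notepad++; any name works. The same extends to other text editors: alias atom="<path to atom>", alias sublime="<path to sublime>", and so on. One last note: the paths mentioned above should preferably not contain spaces, parentheses, or similar characters, otherwise the command will not work. ...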

Bit Manipulation - Fast Exponentiation

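How can exponentiation be computed quickly? To compute $c = a^b$, the naive algorithm multiplies a together b times, which takes $O(b)$ time, whereas fast exponentiation takes $O(\log b)$. Write b in binary; the i-th binary digit has weight $2^{i-1}$, so b splits into a sum of powers of 2, and by the laws of exponents $a^b$ becomes a product of the corresponding powers of a. For example, the naive algorithm needs about 11 multiplications to compute $a^{11}$. Since 11 is 1011 in binary, with bits $b_3 b_2 b_1 b_0 = 1011$, $a^{11}$ splits as: $$a^{11} = a^{b_0 2^0 + b_1 2^1 + b_2 2^2 + b_3 2^3} = a^1 a^2 a^8$$ Each candidate factor $a^1, a^2, a^4, a^8$ is the square of the previous one: $a^4$ is obtained as $a^2 \cdot a^2$, and $a^8$ as the square of $a^4$. Every step only needs the result of the previous step, so only 4 steps are needed, one per binary digit of 11. Any number b has a binary representation of length $O(\log b)$, so the algorithm is $O(\log b)$. In code, each binary digit of b controls whether the result is multiplied by the current squared value. Keep shifting b right until it has no 1 bits left: if the current lowest bit of b is 1, multiply c by the current a; then square a for use with the next bit. In Java, the extreme value INT_MIN (Integer.MIN_VALUE) must be handled. // Recursive public double myPow(double x, int n) { if(n==0) return 1; double temp = myPow(x, n/2); if (n % 2 ==0) return temp * temp; else { if(n > 0) return x*temp*temp; else return (temp*temp) / x; } } // Iterative public double myPow(double x, int n) { double ans = 1; if(n < 0){ n = -(n+1); // handle the extreme value: negating Integer.MIN_VALUE directly would overflow x = 1/x; ans *= x; } while (n > 0) { if ((n & 1) == 1) ans *= x; x *= x; n >>= 1; } return ans; } Fast modular exponentiation: compute a^b mod c. If b is even, a^b mod c = $(a^2)^{b/2} \% c$. If b is odd, a^b mod c = $((a^2)^{b/2} \times a) \% c$, where b/2 is integer division. ...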
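
The even/odd rule for a^b mod c can be sketched as a loop over the bits of b. This is a minimal sketch, not the post's own code: the class name `ModPow` is hypothetical, and it assumes b >= 0 and 0 < c <= 2^31 so that the intermediate products fit in a `long`.

```java
// Hypothetical helper for illustration; assumes b >= 0 and 0 < c <= 2^31.
class ModPow {
    static long modPow(long a, long b, long c) {
        if (b < 0 || c <= 0 || c > (1L << 31)) {
            throw new IllegalArgumentException("requires b >= 0 and 0 < c <= 2^31");
        }
        long result = 1 % c; // 1 % c is 0 when c == 1
        a %= c;
        if (a < 0) {
            a += c;          // normalize a negative base into [0, c)
        }
        while (b > 0) {
            if ((b & 1) == 1) {
                result = result * a % c; // current bit is 1: multiply in the current power
            }
            a = a * a % c;               // square for the next bit
            b >>= 1;
        }
        return result;
    }
}
```

Reducing modulo c after every multiplication keeps each operand below c, which is why the products stay within range under the stated assumption. For larger moduli, `java.math.BigInteger.modPow` is the safer choice.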

Bit Manipulation - Number-Finding Problems

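Problems such as "find the number that appears only once", "find the only two numbers that appear M times", and "find the missing number" can all be solved with a property of the XOR operation: an integer XORed with itself yields 0. Finding the missing number. Problem 1: given an array containing n distinct numbers taken from 0, 1, 2, ..., n, find the number that is missing from the array. The most intuitive solution uses the properties of an arithmetic sequence to solve it directly with math, but that method is limited to arithmetic sequences. Problem 2: in an array of length n, all numbers are in the range 0 to n-1 and some numbers are duplicated; find any one of the duplicated numbers. This is also a problem from the book «剑指offer». Looking at the array size and the element range together shows that the two are tied to each other. For Problem 2, if there were no duplicates, the array could be rearranged so that every position holds the number equal to its index. For Problem 1, if nothing were missing, every index could hold its own number and the largest number would be left over with no place to go. So the numbers in the array can be read as indices, and while traversing the array we also check the values at the positions those numbers point to. This idea solves both problems; the solution to Problem 1 is shown here as an example: public int missingNumber(int[] nums) { int n = nums.length; int misP = n; // points to the position where the number is missing. for (int i = 0; i < n; i++) { while (i != nums[i] && nums[i] != misP) { int j = nums[i]; nums[i] = nums[j]; nums[j] = j; } if (nums[i] == misP) misP = i; } return misP; } Finding the number that appears only once. Problem 3: given a non-empty integer array in which every element appears twice except for one, find the element that appears only once. ...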
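
The XOR property mentioned at the start (x ^ x == 0, and x ^ 0 == x) can be sketched for Problem 3 and, as an alternative to the index-swapping code, for Problem 1. This is the standard XOR approach and may differ from the solution the post goes on to give; the class and method names are hypothetical.

```java
// Hypothetical helper for illustration of the XOR technique.
class XorFind {
    // Problem 3: every element appears twice except one; pairs cancel out under XOR.
    static int singleNumber(int[] nums) {
        int result = 0;
        for (int x : nums) {
            result ^= x;
        }
        return result;
    }

    // Problem 1: nums holds n distinct numbers from 0..n; XOR all indices 0..n
    // with all values, so every present number cancels and the missing one remains.
    static int missingNumber(int[] nums) {
        int result = nums.length; // start with index n, which the loop does not visit
        for (int i = 0; i < nums.length; i++) {
            result ^= i ^ nums[i];
        }
        return result;
    }
}
```

Unlike the index-swapping version, this sketch does not modify the input array, and unlike a sum-based approach it cannot overflow.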