1. Java char data type: A 16-bit unsigned integer.
-- Supports original 16-bit Unicode.
-- Supports 21-bit Unicode 3.0 (awkwardly).
2. String data type in Java: Sequence of characters (immutable).
-- Length: Number of characters.
-- Indexing: Get the ith character.
-- Substring extraction: Get a contiguous subsequence of characters.
-- String concatenation: Append one character to end of another string.
-- Implementation:
public final class String implements Comparable<String> { private char[] value; // characters private int offset; // index of first char in array private int length; // length of string private int hash; // cache of hashCode() public int length() { return length; } public char charAt(int i) { return value[i + offset]; } private String(int offset, int length, char[] value) { this.offset = offset; this.length = length; this.value = value; } public String substring(int from, int to) { return new String(offset + from, to - from, value); } ... }
-- Memory: 40 + 2N bytes for a virgin String of length N.
-- Performance:
3. The StringBuilder data type
-- Sequence of characters (mutable).
-- Underlying implementation: Resizing char[] array and length.
-- StringBuffer data type is similar, but thread safe (and slower).
4. How to efficiently reverse a string?
public static String reverse(String s) { StringBuilder rev = new StringBuilder(); for (int i = s.length() - 1; i >= 0; i--) rev.append(s.charAt(i)); return rev.toString(); }
5. How to efficiently form array of suffixes?
public static String[] suffixes(String s) { int N = s.length(); String[] suffixes = new String[N]; for (int i = 0; i < N; i++) suffixes[i] = s.substring(i, N); return suffixes; }
6. How long to compute length of longest common prefix?
public static int lcp(String s, String t) { int N = Math.min(s.length(), t.length()); for (int i = 0; i < N; i++) if (s.charAt(i) != t.charAt(i)) return i; return N; }
7. Key-indexed counting
-- Assumption: Keys are integers between 0 and R - 1.
-- Applications:
- Sort string by first letter.
- Sort class roster by section.
- Sort phone numbers by area code.
- Subroutine in a sorting algorithm.
-- Algorithm
- Count frequencies of each letter using key as index.
- Compute frequency cumulates which specify destinations.
- Access cumulates using key as index to move items.
- Copy back into original array.
-- Implementation:
int N = a.length; int[] count = new int[R+1]; for (int i = 0; i < N; i++) count[a[i]+1]++; for (int r = 0; r < R; r++) count[r+1] += count[r]; for (int i = 0; i < N; i++) aux[count[a[i]]++] = a[i]; for (int i = 0; i < N; i++) a[i] = aux[i];
-- Stable
8. LSD string (radix) sort
-- Consider characters from right to left.
-- Stably sort using dth character as the key (using key-indexed counting).
-- Pf. [by induction on i] After pass i, strings are sorted by last i characters.
-- If two strings differ on sort key, key-indexed sort puts them in proper relative order.
-- If two strings agree on sort key, stability keeps them in proper relative order.
-- Implementation:
public class LSD { public static void sort(String[] a, int W) { int R = 256; // extended ASCII Alphabets int N = a.length; String[] aux = new String[N]; for (int d = W-1; d >= 0; d--) { int[] count = new int[R+1]; for (int i = 0; i < N; i++) count[a[i].charAt(d) + 1]++; for (int r = 0; r < R; r++) count[r+1] += count[r]; for (int i = 0; i < N; i++) aux[count[a[i].charAt(d)]++] = a[i]; for (int i = 0; i < N; i++) a[i] = aux[i]; } } }
9. MSD string (radix) sort.
-- Partition array into R pieces according to first character (use key-indexed counting).
-- Recursively sort all strings that start with each character (key-indexed counts delineate subarrays to sort).
-- Variable-length strings : Treat strings as if they had an extra char at end (smaller than any char).
-- Implementation:
public class MSD { public static void sort(String[] a) { aux = new String[a.length]; sort(a, aux, 0, a.length - 1, 0); } private static void sort(String[] a, String[] aux, int lo, int hi, int d) { int R = 256 // extended ASCII Alphabets if (hi <= lo) return; int[] count = new int[R+2]; // R+1 values, including the special small value for padding for (int i = lo; i <= hi; i++) count[charAt(a[i], d) + 2]++; // alphabet j is counted at count[j + 2] for (int r = 0; r < R+1; r++) count[r+1] += count[r]; for (int i = lo; i <= hi; i++) aux[count[charAt(a[i], d) + 1]++] = a[i]; // after accumulating number of alphabets less than j is count[j + 2 - 1] for (int i = lo; i <= hi; i++) a[i] = aux[i - lo]; for (int r = 0; r < R; r++) sort(a, aux, lo + count[r], lo + count[r+1] - 1, d+1); // after moving data from a[] to aux[], count[r] records the number of alphabits less than r } private static int charAt(String s, int d) { if (d < s.length()) return s.charAt(d); else return -1; } }
-- Performance issues
-- Much too slow for small subarrays.
- Each function call needs its own count[] array.
- ASCII (256 counts): 100x slower than copy pass for N = 2.
- Unicode (65,536 counts): 32,000x slower for N = 2.
-- Huge number of small subarrays because of recursion.
-- Solution: Cut off to insertion sort for small subarrays.
-- Insertion sort, but start at dth character.
-- Implement less() so that it compares starting at dth character.
public static void sort(String[] a, int lo, int hi, int d) { for (int i = lo; i <= hi; i++) for (int j = i; j > lo && less(a[j], a[j-1], d); j--) exch(a, j, j-1); } private static boolean less(String v, String w, int d) { return v.substring(d).compareTo(w.substring(d)) < 0; }
-- performance
-- MSD examines just enough characters to sort the keys.
-- Number of characters examined depends on keys.
10. MSD string sort vs. quicksort for strings
-- Disadvantages of MSD string sort.
- Extra space for aux[]. (caused by counting sort)
- Extra space for count[]. (caused by counting sort)
- Inner loop has a lot of instructions. (caused by counting sort)
- Accesses memory "randomly" (cache inefficient).
-- Disadvantage of quicksort.
- Linearithmic number of string compares (not linear).
- Has to rescan many characters in keys with long prefix matches. ( for comparison)
11. 3-way string quicksort
-- Do 3-way partitioning on the dth character.
-- Less overhead than R-way partitioning in MSD string sort.
-- Does not re-examine characters equal to the partitioning char
(but does re-examine characters not equal to the partitioning char).
-- Implementation
private static void sort(String[] a) { sort(a, 0, a.length - 1, 0); } private static void sort(String[] a, int lo, int hi, int d) { if (hi <= lo) return; int lt = lo, gt = hi; int v = charAt(a[lo], d); int i = lo + 1; while (i <= gt) { int t = charAt(a[i], d); if (t < v) exch(a, lt++, i++); else if (t > v) exch(a, i, gt--); else i++; } // sort 3 subarrays recursively sort(a, lo, lt-1, d); if (v >= 0) sort(a, lt, gt, d+1); sort(a, gt+1, hi, d); sort(a, gt+1, hi, d); }
-- Performance
-- Standard quicksort uses ~ 2 N ln N string compares on average. Costly for keys with long common prefixes (and this is a common case!)
-- 3-way string (radix) quicksort uses ~ 2 N ln N character compares on average for random strings. Avoids re-comparing long common prefixes.
12. Keyword-in-context search:
-- Given a text of N characters, preprocess it to enable fast substring search (find all occurrences of query string context).
-- Solution : suffix sort the text and binary search for query; scan until mismatch.
13. Longest repeated substring
-- Given a string of N characters, find the longest repeated substring.
-- Implementation :
public String lrs(String s) { int N = s.length(); String[] suffixes = new String[N]; for (int i = 0; i < N; i++) suffixes[i] = s.substring(i, N); Arrays.sort(suffixes); String lrs = ""; for (int i = 0; i < N-1; i++) { int len = lcp(suffixes[i], suffixes[i+1]); if (len > lrs.length()) lrs = suffixes[i].substring(0, len); } return lrs; }
-- Bad Input : longest repeated substring very long.
-- LRS needs at least 1 + 2 + 3 + ... + D character compares, where D = length of longest match.
14. Manber-Myers MSD algorithm
-- Suffix sorting in linearithmic time.
-- Algorithm
- Phase 0: sort on first character using key-indexed counting sort.
- Phase i: given array of suffixes sorted on first 2^(i-1) characters, create array of suffixes sorted on first 2^i characters.
-- Worst-case running time: N lg N.
-- Finishes after lg N phases.
-- Can perform a phase in linear time.
相关推荐
5.1 String Sorts 702 5.2 Tries 730 5.3 Substring Search 758 5.4 Regular Expressions 788 5.5 Data Compression 810 6 Context . . . . . . . . . . . . . . . . . . . . . . . 853 Index . . . . . . . . . . ....
8. **字符串排序(String Sorts)**:091_51StringSorts.pdf 专门讨论字符串的排序问题,涉及到字符串比较的细节和特定的排序算法,如Trie树、Boyer-Moore排序等。 9. **基础符号表(Elementary Symbol Tables)**:061...
private String sorts; @Id @Column(name = "UUID") public String getUuid() { return uuid; } public void setUuid(String uuid) { this.uuid = uuid; } @Basic @Column(name = "NAME") public ...
string str_chang = "select sorts.sid, sortname from sorts where categoryid=" + category.SelectedValue.ToString(); sorts.DataSource = cate_chang.todateset(str_chang); sorts.DataTextField = "sortname...
public static void main(String[] args) { // TODO Auto-generated method stub Sort s=new Sort(); int[] arr = { 3, 44, 38, 5, 47, 15, 36, 26, 27, 2, 46, 4, 19, 50, 48 }; s.bubbleSorts(arr);//冒泡排序...
A handy guide of sorts for any computer science professional, Data Structures And Algorithms Made Easy: Data Structure And Algorithmic Puzzles is a solution bank for various complex problems related ...
For example, `ls -l | sort -n` sorts the list of files by size. - **File Tests**: Check file attributes using tests like `-d` (directory), `-f` (regular file), `-r` (readable), `-w` (writable), and `...
**Example Answer**: To write a program that sorts elements, you can use various sorting algorithms available in Java, such as `Arrays.sort()`, `Collections.sort()`, or implement your own sorting logic...
sorty特定于类型的快速并发/...例如:sorty.SortS(string_slice)//原生切片sorty.Sort(n,lesswap)// lesswap()基于Mxg的函数(默认为3)是用于每个Sort *()调用进行排序的goroutine的最大数量。 使用语义ve
filter: ".ignore-elements", // Selectors that do not lead to dragging (String or Function) preventOnFilter: true, // Call `event.preventDefault()` when triggered `filter` draggable: ".item", // ...
你可以通过 `MongoDatabase.getCollection(String name)` 方法获取指定集合。 3. **文档操作**:MongoDB 中的数据以 JSON 格式的文档存储。Java 驱动提供了 `Document` 类来表示这些文档。插入文档可以通过 `...
- `l.sort(f)`: Sorts the list using the comparison function `f` (default is `cmp`). - **Dictionary Operations:** - `d = {'x': 42, 'y': 3.14, 'z': 7}`: Creates a dictionary. - `d['x']`: Retrieves ...
_.countBy(list, iterator, [context]) Sorts a list into groups and returns a count for the number of objects in each group. Similar to groupBy _.shuffle(list) Returns a shuffled copy of the list, using...
public static void main(String[] args) { MongoClient mongoClient = MongoClients.create("mongodb://localhost:27017"); MongoDatabase database = mongoClient.getDatabase("testDatabase"); // 进行其他...
This rule rewrites URLs like `/usermanagement/Alice/baz` to the query string format. It works because the web server interprets the rewritten URL and serves the appropriate ...
Fig01_25.cpp: Using function objects: Case insensitive string comparison LambdaExample.cpp: (Not in the book): rewriting Fig 1.25 with lambdas MaxSumTest.cpp: Various maximum subsequence sum ...
- **String Comparisons**: 讨论字符串比较的高级策略。 - **Advanced Setting**: 探讨排序组件的一些高级设置选项。 - **Sort Considerations**: 提供在使用排序时需要注意的关键事项。 - **Sort Wrap-Up**: ...
10. `$selects`, `$wheres`, `$sorts`, `$limit`, `$offset`:用于构建查询语句的参数,分别对应选择字段、过滤条件、排序、限制数量和偏移量。 11. `$timeout`:连接超时时间。 12. `$key`:用于处理多个连接配置的...
4.4 Output as String Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 4.4.1 Converting Numbers to Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 ...