`
leonzhx
  • 浏览: 797904 次
  • 性别: Icon_minigender_1
  • 来自: 上海
社区版块
存档分类
最新评论
阅读更多

1.  Java char data type: A 16-bit unsigned integer.

    --  Supports original 16-bit Unicode.

    --  Supports 21-bit Unicode 3.0 (awkwardly).

 

2.  String data type in Java: Sequence of characters (immutable).

    --  Length: Number of characters.

    --  Indexing: Get the ith character.

    --  Substring extraction: Get a contiguous subsequence of characters.

    --  String concatenation: Append one character to end of another string.


 

    --  Implementation:

 

public final class String implements Comparable<String>
{
    private char[] value; // characters
    private int offset; // index of first char in array
    private int length; // length of string
    private int hash; // cache of hashCode()
    public int length()
    { return length; }
    public char charAt(int i)
    { return value[i + offset]; }

    private String(int offset, int length, char[] value)
    {
        this.offset = offset;
        this.length = length;
        this.value = value;
    }
    public String substring(int from, int to)
    { return new String(offset + from, to - from, value); }

...
}

     --  Memory: 40 + 2N bytes for a virgin String of length N.

 

     --  Performance:


 

3.  The StringBuilder data type

    --  Sequence of characters (mutable).

    --  Underlying implementation: Resizing char[] array and length.

    --  StringBuffer data type is similar, but thread safe (and slower).

 

4.  How to efficiently reverse a string?

 

public static String reverse(String s)
{
    StringBuilder rev = new StringBuilder();
    for (int i = s.length() - 1; i >= 0; i--)
        rev.append(s.charAt(i));
    return rev.toString();
}

 

 

5.  How to efficiently form array of suffixes?


 

public static String[] suffixes(String s)
{
    int N = s.length();
    String[] suffixes = new String[N];
    for (int i = 0; i < N; i++)
        suffixes[i] = s.substring(i, N);
    return suffixes;
}

 

 

6.  How long to compute length of longest common prefix?

 

public static int lcp(String s, String t)
{
    int N = Math.min(s.length(), t.length());
    for (int i = 0; i < N; i++)
        if (s.charAt(i) != t.charAt(i))
            return i;
    return N;
}

 

 

7.  Key-indexed counting

    --  Assumption: Keys are integers between 0 and R - 1.

    --  Applications:

        -  Sort string by first letter.

        -  Sort class roster by section.

        -  Sort phone numbers by area code.

        -  Subroutine in a sorting algorithm.

    --  Algorithm

        -  Count frequencies of each letter using key as index.

        -  Compute frequency cumulates which specify destinations.

        -  Access cumulates using key as index to move items.

        -  Copy back into original array.

    --  Implementation:

 

int N = a.length;
int[] count = new int[R+1];
for (int i = 0; i < N; i++)
    count[a[i]+1]++;
for (int r = 0; r < R; r++)
    count[r+1] += count[r];
for (int i = 0; i < N; i++)
    aux[count[a[i]]++] = a[i];
for (int i = 0; i < N; i++)
    a[i] = aux[i];

 

     --  Stable

 


8.  LSD string (radix) sort

    --  Consider characters from right to left.

    --  Stably sort using dth character as the key (using key-indexed counting).

    --  Pf. [by induction on i] After pass i, strings are sorted by last i characters.

        --  If two strings differ on sort key, key-indexed sort puts them in proper relative order.

        --  If two strings agree on sort key, stability keeps them in proper relative order.

    --  Implementation:

 

public class LSD
{
    public static void sort(String[] a, int W)
    {
        int R = 256; // extended ASCII Alphabets
        int N = a.length;
        String[] aux = new String[N];
        for (int d = W-1; d >= 0; d--)
        {
            int[] count = new int[R+1];
            for (int i = 0; i < N; i++)
                count[a[i].charAt(d) + 1]++;
            for (int r = 0; r < R; r++)
                count[r+1] += count[r];
            for (int i = 0; i < N; i++)
                aux[count[a[i].charAt(d)]++] = a[i];
            for (int i = 0; i < N; i++)
                a[i] = aux[i];
        }
    }
}

 

 

 

9.  MSD string (radix) sort.

    --  Partition array into R pieces according to first character (use key-indexed counting).

    --  Recursively sort all strings that start with each character (key-indexed counts delineate subarrays to sort).

    --  Variable-length strings : Treat strings as if they had an extra char at end (smaller than any char).

    --  Implementation:

 

public class MSD {
    public static void sort(String[] a)
    {
        aux = new String[a.length];
        sort(a, aux, 0, a.length - 1, 0);
    }

    private static void sort(String[] a, String[] aux, int lo, int hi, int d)
    {
        int R = 256 // extended ASCII Alphabets
        if (hi <= lo) return;
        int[] count = new int[R+2]; // R+1 values, including the special small value for padding
        for (int i = lo; i <= hi; i++)
            count[charAt(a[i], d) + 2]++; // alphabet j is counted at count[j + 2]
        for (int r = 0; r < R+1; r++)
            count[r+1] += count[r];
        for (int i = lo; i <= hi; i++)
            aux[count[charAt(a[i], d) + 1]++] = a[i]; // after accumulating number of alphabets less than j is count[j + 2 - 1]
        for (int i = lo; i <= hi; i++)
            a[i] = aux[i - lo];
        for (int r = 0; r < R; r++)
            sort(a, aux, lo + count[r], lo + count[r+1] - 1, d+1); // after moving data from a[] to aux[], count[r] records the number of alphabits less than r
    }

    private static int charAt(String s, int d)
    {
        if (d < s.length()) return s.charAt(d);
        else return -1;
    }
}

 

 

     --  Performance issues

        --  Much too slow for small subarrays.

            -  Each function call needs its own count[] array.

            -  ASCII (256 counts): 100x slower than copy pass for N = 2.

            -  Unicode (65,536 counts): 32,000x slower for N = 2.

        --  Huge number of small subarrays because of recursion.

    --  Solution: Cut off to insertion sort for small subarrays.

        --  Insertion sort, but start at dth character.

        --  Implement less() so that it compares starting at dth character.

public static void sort(String[] a, int lo, int hi, int d)
{
    for (int i = lo; i <= hi; i++)
        for (int j = i; j > lo && less(a[j], a[j-1], d); j--)
            exch(a, j, j-1);
}

private static boolean less(String v, String w, int d)
{ return v.substring(d).compareTo(w.substring(d)) < 0; }

     --  performance

        --  MSD examines just enough characters to sort the keys.

        --  Number of characters examined depends on keys.


 

10.  MSD string sort vs. quicksort for strings

    --  Disadvantages of MSD string sort.

        -  Extra space for aux[]. (caused by counting sort)

        -  Extra space for count[]. (caused by counting sort)

        -  Inner loop has a lot of instructions. (caused by counting sort)

        -  Accesses memory "randomly" (cache inefficient).

    --  Disadvantage of quicksort.

        -  Linearithmic number of string compares (not linear).

        -  Has to rescan many characters in keys with long prefix matches. ( for comparison)

 

11.  3-way string quicksort

    --  Do 3-way partitioning on the dth character.

    --  Less overhead than R-way partitioning in MSD string sort.

    --  Does not re-examine characters equal to the partitioning char

       (but does re-examine characters not equal to the partitioning char).

    --  Implementation

private static void sort(String[] a)
{ sort(a, 0, a.length - 1, 0); }

private static void sort(String[] a, int lo, int hi, int d)
{
    if (hi <= lo) return;
    int lt = lo, gt = hi;
    int v = charAt(a[lo], d);
    int i = lo + 1;
    while (i <= gt)
    {
        int t = charAt(a[i], d);
        if (t < v) exch(a, lt++, i++);
        else if (t > v) exch(a, i, gt--);
        else i++;
    }
    // sort 3 subarrays recursively
    sort(a, lo, lt-1, d);
    if (v >= 0) sort(a, lt, gt, d+1);
        sort(a, gt+1, hi, d);
    sort(a, gt+1, hi, d);
}



 

     --  Performance

        --  Standard quicksort uses ~ 2 N ln N string compares on average. Costly for keys with long common prefixes (and this is a common case!)

        --  3-way string (radix) quicksort uses ~ 2 N ln N character compares on average for random strings. Avoids re-comparing long common prefixes.



 

12.  Keyword-in-context search:

    --  Given a text of N characters, preprocess it to enable fast substring search (find all occurrences of query string context).

    --  Solution : suffix sort the text and binary search for query; scan until mismatch.




 

13.  Longest repeated substring

    --  Given a string of N characters, find the longest repeated substring.

    --  Implementation :

public String lrs(String s)
{
    int N = s.length();
    String[] suffixes = new String[N];
    for (int i = 0; i < N; i++)
        suffixes[i] = s.substring(i, N);
    Arrays.sort(suffixes);
    String lrs = "";
    for (int i = 0; i < N-1; i++)
    {
        int len = lcp(suffixes[i], suffixes[i+1]);
        if (len > lrs.length())
            lrs = suffixes[i].substring(0, len);
    }
    return lrs;
}

 

     --  Bad Input : longest repeated substring very long.

        --  LRS needs at least 1 + 2 + 3 + ... + D character compares, where D = length of longest match.

 

14.  Manber-Myers MSD algorithm

    --  Suffix sorting in linearithmic time.

    --  Algorithm

        -  Phase 0: sort on first character using key-indexed counting sort.

        -  Phase i: given array of suffixes sorted on first 2^(i-1) characters, create array of suffixes sorted on first 2^i characters.



 

    --  Worst-case running time: N lg N.

        --  Finishes after lg N phases.

        --  Can perform a phase in linear time.

 

  • 大小: 18.9 KB
  • 大小: 12.8 KB
  • 大小: 31.3 KB
  • 大小: 28 KB
  • 大小: 43.7 KB
  • 大小: 60.3 KB
  • 大小: 43.7 KB
  • 大小: 87.3 KB
  • 大小: 55.5 KB
  • 大小: 50.9 KB
  • 大小: 54.7 KB
  • 大小: 42.2 KB
  • 大小: 46.5 KB
  • 大小: 27.2 KB
  • 大小: 60.9 KB
分享到:
评论

相关推荐

    算法第四版(algorithms),2011年新出版,算法设计力作

    5.1 String Sorts 702 5.2 Tries 730 5.3 Substring Search 758 5.4 Regular Expressions 788 5.5 Data Compression 810 6 Context . . . . . . . . . . . . . . . . . . . . . . . 853 Index . . . . . . . . . . ....

    算法ppt.zip

    8. **字符串排序(String Sorts)**:091_51StringSorts.pdf 专门讨论字符串的排序问题,涉及到字符串比较的细节和特定的排序算法,如Trie树、Boyer-Moore排序等。 9. **基础符号表(Elementary Symbol Tables)**:061...

    idea hibernate jpa 生成实体类的实现

    private String sorts; @Id @Column(name = "UUID") public String getUuid() { return uuid; } public void setUuid(String uuid) { this.uuid = uuid; } @Basic @Column(name = "NAME") public ...

    二级动态联动菜单源码

    string str_chang = "select sorts.sid, sortname from sorts where categoryid=" + category.SelectedValue.ToString(); sorts.DataSource = cate_chang.todateset(str_chang); sorts.DataTextField = "sortname...

    八种排序方法附实现源码.zip

    public static void main(String[] args) { // TODO Auto-generated method stub Sort s=new Sort(); int[] arr = { 3, 44, 38, 5, 47, 15, 36, 26, 27, 2, 46, 4, 19, 50, 48 }; s.bubbleSorts(arr);//冒泡排序...

    Data.Structures.and.Algorithms.Made.Easy.epub

    A handy guide of sorts for any computer science professional, Data Structures And Algorithms Made Easy: Data Structure And Algorithmic Puzzles is a solution bank for various complex problems related ...

    shell cheat sheet

    For example, `ls -l | sort -n` sorts the list of files by size. - **File Tests**: Check file attributes using tests like `-d` (directory), `-f` (regular file), `-r` (readable), `-w` (writable), and `...

    java英文笔试

    **Example Answer**: To write a program that sorts elements, you can use various sorting algorithms available in Java, such as `Arrays.sort()`, `Collections.sort()`, or implement your own sorting logic...

    Go中的快速并发/并行排序-Golang开发

    sorty特定于类型的快速并发/...例如:sorty.SortS(string_slice)//原生切片sorty.Sort(n,lesswap)// lesswap()基于Mxg的函数(默认为3)是用于每个Sort *()调用进行排序的goroutine的最大数量。 使用语义ve

    Sortable前端框架

    filter: ".ignore-elements", // Selectors that do not lead to dragging (String or Function) preventOnFilter: true, // Call `event.preventDefault()` when triggered `filter` draggable: ".item", // ...

    mongoDB 操作 java源代码

    你可以通过 `MongoDatabase.getCollection(String name)` 方法获取指定集合。 3. **文档操作**:MongoDB 中的数据以 JSON 格式的文档存储。Java 驱动提供了 `Document` 类来表示这些文档。插入文档可以通过 `...

    Python 2.5 Reference Card

    - `l.sort(f)`: Sorts the list using the comparison function `f` (default is `cmp`). - **Dictionary Operations:** - `d = {'x': 42, 'y': 3.14, 'z': 7}`: Creates a dictionary. - `d['x']`: Retrieves ...

    lodash underscore js库速查手册

    _.countBy(list, iterator, [context]) Sorts a list into groups and returns a count for the number of objects in each group. Similar to groupBy _.shuffle(list) Returns a shuffled copy of the list, using...

    java mongogb简单实例

    public static void main(String[] args) { MongoClient mongoClient = MongoClients.create("mongodb://localhost:27017"); MongoDatabase database = mongoClient.getDatabase("testDatabase"); // 进行其他...

    php developer interview

    This rule rewrites URLs like `/usermanagement/Alice/baz` to the query string format. It works because the web server interprets the rewritten URL and serves the appropriate ...

    c++数据结构与算法实现

    Fig01_25.cpp: Using function objects: Case insensitive string comparison LambdaExample.cpp: (Not in the book): rewriting Fig 1.25 with lambdas MaxSumTest.cpp: Various maximum subsequence sum ...

    31 Days of SSIS

    - **String Comparisons**: 讨论字符串比较的高级策略。 - **Advanced Setting**: 探讨排序组件的一些高级设置选项。 - **Sort Considerations**: 提供在使用排序时需要注意的关键事项。 - **Sort Wrap-Up**: ...

    php封装的mongodb操作类代码

    10. `$selects`, `$wheres`, `$sorts`, `$limit`, `$offset`:用于构建查询语句的参数,分别对应选择字段、过滤条件、排序、限制数量和偏移量。 11. `$timeout`:连接超时时间。 12. `$key`:用于处理多个连接配置的...

    python programming

    4.4 Output as String Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 4.4.1 Converting Numbers to Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 ...

Global site tag (gtag.js) - Google Analytics