first non-repeating character in a string

yuanhsh

浏览: 569997 次
性别:
来自: 上海

最近访客更多访客>>

loginboot

Michael_113

z164974775

梦风云

博主相关

博客

微博

相册

留言

关于我

文章分类

社区版块

存档分类

博客分类：

Technical Interviews
String

Given a string, find the first non-repeating character in it. For example, if the input string is “GeeksforGeeks”, then output should be ‘f’ and if input string is “GeeksQuiz”, then output should be ‘G’.

We can use string characters as index and build a count array. Following is the algorithm.

1) Scan the string from left to right and construct the count array.
2) Again, scan the string from left to right and check for count of each
 character, if you find an element who's count is 1, return it.

Example:

Input string: str = geeksforgeeks
1: Construct character count array from the input string.
   ....
  count['e'] = 4
  count['f'] = 1
  count['g'] = 2
  count['k'] = 2
  ……
2: Get the first character who's count is 1 ('f').

Implementation:

#include<stdlib.h>
#include<stdio.h>
#define NO_OF_CHARS 256
 
/* Returns an array of size 256 containg count
   of characters in the passed char array */
int *getCharCountArray(char *str) {
   int *count = (int *)calloc(sizeof(int), NO_OF_CHARS);
   int i;
   for (i = 0; *(str+i);  i++)
      count[*(str+i)]++;
   return count;
}
 
/* The function returns index of first non-repeating
   character in a string. If all characters are repeating 
   then returns -1 */
int firstNonRepeating(char *str) {
  int *count = getCharCountArray(str);
  int index = -1, i;
 
  for (i = 0; *(str+i);  i++) {
    if (count[*(str+i)] == 1) {
      index = i;
      break;
    }   
  }  
   
  free(count); // To avoid memory leak
  return index;
}

Can we do it by traversing the string only once?
The above approach takes O(n) time, but in practice it can be improved. The first part of the algorithm runs through the string to construct the count array (in O(n) time). This is reasonable. But the second part about running through the string again just to find the first non-repeater is not good in practice. In real situations, your string is expected to be much larger than your alphabet. Take DNA sequences for example: they could be millions of letters long with an alphabet of just 4 letters. What happens if the non-repeater is at the end of the string? Then we would have to scan for a long time (again).
We can augment the count array by storing not just counts but also the index of the first time you encountered the character e.g. (3, 26) for ‘a’ meaning that ‘a’ got counted 3 times and the first time it was seen is at position 26. So when it comes to finding the first non-repeater, we just have to scan the count array, instead of the string. Thanks to Ben for suggesting this approach.

Following is C implementation of the extended approach that traverses the input string only once.

#include <stdlib.h>
#include <stdio.h>
#include <limits.h>
#define NO_OF_CHARS 256
 
// Structure to store count of a character and index of the first
// occurrence in the input string
struct countIndex {
   int count;
   int index;
};
 
/* Returns an array of above structure type. The size of
   array is NO_OF_CHARS */
struct countIndex *getCharCountArray(char *str) {
   struct countIndex *count =
        (struct countIndex *)calloc(sizeof(countIndex), NO_OF_CHARS);
   int i;
   for (i = 0; *(str+i);  i++) {
      (count[*(str+i)].count)++;
 
      // If it's first occurrence, then store the index
      if (count[*(str+i)].count == 1)
         count[*(str+i)].index = i;
   }
   return count;
}
 
/* The function returns index of the first non-repeating
    character in a string. If all characters are repeating
    then reurns INT_MAX */
int firstNonRepeating(char *str) {
  struct countIndex *count = getCharCountArray(str);
  int result = INT_MAX, i;
 
  for (i = 0; i < NO_OF_CHARS;  i++) {
    // If this character occurs only once and appears
    // before the current result, then update the result
    if (count[i].count == 1 && result > count[i].index)
       result = count[i].index;
  }
 
  free(count); // To avoid memory leak
  return result;
}

Reference:

http://www.geeksforgeeks.org/given-a-string-find-its-first-non-repeating-character/

分享到：

Find the two repeating elements in a giv ... | LeetCode 28 - Implement strStr()

2015-02-14 03:23
浏览 972
评论(0)
分类:编程语言
查看更多

发表评论

您还没有登录,请您登录后再发表评论

最近访客更多访客>>

博主相关

文章分类

社区版块

存档分类

最新评论