jk2016 February 2016

Find word using Java

I am trying to write a Java class that finds the words surrounded by ( ) in a text file and outputs each word with its number of occurrences on a separate line.

How can I write this in Java?

Input file

School (AAA) to (AAA) 10/22/2011 ssss(ffs)
(ffs) 7368 House 8/22/2011(h76yu)  come 789  (AAA)
Car (h76yu) to  (h76yu) extract9998790
2/3/2015 (AAA) 

Output file

(AAA) 4    
(ffs) 2    
(h76yu) 3 

This is what I got so far..

public class  FindTextOccurances  {
public static void main(String[] args) throws IOException {

    int sum=0
    String line = value.toString();

    for (String word : line.split("(\\W+")) {
        if (word.charAt(0) == '(') {
            if (word.length() > 0) {
                sum +=line.get();
            }
            context.write(new Text(word), new IntWritable(sum));
        } 
    }
}

Answers


Diego Martinoia February 2016

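If you split on the parenthesized words, you keep ALL the things that ARE NOT between parentheses, because split() discards whatever the pattern matches. Note also that an unescaped ( opens a regex group, so "(\\W+" does not compile, and \W matches non-word characters; literal parentheses around a word are written "\\(\\w+\\)".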

What you want is a matcher:

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
...
Map<String, Integer> occurrences = new HashMap<>();
// \\( and \\) match literal parentheses; \\w+ matches the word between them
Matcher m = Pattern.compile("\\(\\w+\\)").matcher(myString);
while (m.find()) {
  String matched = m.group();
  String word = matched.substring(1, matched.length() - 1); // remove parentheses
  occurrences.put(word, occurrences.getOrDefault(word, 0) + 1);
}
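
For reference, here is a minimal self-contained sketch of the matcher approach run against the sample input from the question. The class name ParenWordCounter and the method count are made up for illustration, and a TreeMap is an assumption on my part so that the words come out in sorted order. The parentheses are kept in the keys to match the requested output format.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class ParenWordCounter {
    // Literal "(", one or more word characters, literal ")".
    private static final Pattern PAREN_WORD = Pattern.compile("\\(\\w+\\)");

    // Counts each parenthesized word in the text; keys keep their parentheses.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> occurrences = new TreeMap<>(); // sorted by key
        Matcher m = PAREN_WORD.matcher(text);
        while (m.find()) {
            occurrences.merge(m.group(), 1, Integer::sum);
        }
        return occurrences;
    }

    public static void main(String[] args) {
        // Sample input copied from the question.
        String text = String.join("\n",
            "School (AAA) to (AAA) 10/22/2011 ssss(ffs)",
            "(ffs) 7368 House 8/22/2011(h76yu)  come 789  (AAA)",
            "Car (h76yu) to  (h76yu) extract9998790",
            "2/3/2015 (AAA) ");
        count(text).forEach((word, n) -> System.out.println(word + " " + n));
    }
}
```

Because the matcher scans the whole text, words glued to other characters, such as ssss(ffs), are still found.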


Andy Turner February 2016

You can find the text between brackets without splitting or using regular expressions like so (assuming that all brackets are closed, and you don't have nested brackets):

int lastBracket = -1;
while (true) {
  int start = line.indexOf('(', lastBracket + 1);
  if (start == -1) {
    break;
  }
  int end = line.indexOf(')', start + 1);

  System.out.println(line.substring(start + 1, end)); // end index is exclusive

  lastBracket = end; // resume the search after the closing bracket
}
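
A possible variant of the same indexOf scan, wrapped in a method that collects the results instead of printing them. The names BracketScanner and extract are hypothetical. Unlike the loop above, this sketch stops quietly when a bracket is never closed, which is an extra assumption about how bad input should be handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original answer.
class BracketScanner {
    // Returns the text inside each "(...)" pair, in order of appearance.
    static List<String> extract(String line) {
        List<String> found = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = line.indexOf('(', from);
            if (start == -1) {
                break; // no more opening brackets
            }
            int end = line.indexOf(')', start + 1);
            if (end == -1) {
                break; // unclosed bracket: stop instead of throwing
            }
            found.add(line.substring(start + 1, end)); // end index is exclusive
            from = end + 1;
        }
        return found;
    }

    public static void main(String[] args) {
        // First line of the sample input from the question.
        System.out.println(extract("School (AAA) to (AAA) 10/22/2011 ssss(ffs)"));
        // prints [AAA, AAA, ffs]
    }
}
```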


Akhil February 2016

This may help. I did it with regular expressions. I did not declare all the variables, so adjust them to your needs. I hope this solves your problem.

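 // needs java.io.* and java.util.*; "file" is your input file
 BufferedReader fr = new BufferedReader(new InputStreamReader(new FileInputStream(file), "ASCII"));
    List<String> words = new ArrayList<>();//those are your words
    while(true)
    {
        String line = fr.readLine();
        if(line==null)
            break;
        words.addAll(Arrays.asList(line.trim().split("\\s+")));
    }
    fr.close();
  for(int i = 0;i<words.size();i++)
    {
        String a = words.get(i);
          // only whole tokens such as "(AAA)" match; "ssss(ffs)" does not
          if(a.matches("\\(\\w+\\)") && words.indexOf(a)==i)//first occurrence only
             {
               int count = 0;
               for(int j = i;j<words.size();j++)
                 {
                        if(words.get(j).equals(a))
                            count++;
                 }
              System.out.println(a+" "+count);
             }
    }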
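
Since the question asks for an input file and an output file, here is a rough sketch that reads one and writes the other. The class name ParenWordFileCounter, the method countFile and the file names input.txt and output.txt are all assumptions for illustration. It matches on each whole line rather than on space-separated tokens, and reads the file as UTF-8 (of which ASCII is a subset).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class ParenWordFileCounter {
    // Reads "in" line by line and writes one "(word) count" line per word to "out".
    static void countFile(Path in, Path out) throws IOException {
        Pattern parenWord = Pattern.compile("\\(\\w+\\)");
        Map<String, Integer> counts = new TreeMap<>(); // sorted by key
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = parenWord.matcher(line);
                while (m.find()) {
                    counts.merge(m.group(), 1, Integer::sum);
                }
            }
        }
        List<String> lines = new ArrayList<>();
        counts.forEach((word, n) -> lines.add(word + " " + n));
        Files.write(out, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // File names are placeholders; replace them with your own paths.
        countFile(Paths.get("input.txt"), Paths.get("output.txt"));
    }
}
```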
