Set with duplicates java - import from file - java




I have a small project.



The project reads a .txt file into a String. The format is similar to CSV, with semicolons (";") as separators.



Next, the String is split into an ArrayList.



Then, using Predicate, I remove elements that do not interest me.



Finally, I convert the ArrayList to a TreeSet to remove duplicates.
Unfortunately, duplicates still show up in the result.



I opened the file in Notepad++ and switched the encoding to ANSI to check for stray characters.



Unfortunately, everything looks good and duplicates are still there.



Uploaded input file - https://drive.google.com/open?id=1OqIKUTvMwK3FPzNvutLu-GYpvocUsSgu



Any ideas?


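import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

// Assumed: FileUtils comes from Apache Commons IO (the original post shows no imports).
import org.apache.commons.io.FileUtils;

public class OpenSCV {
    private static final String SAMPLE_CSV_FILE_PATH = "/Downloads/all.txt";

    public static void main(String[] args) throws IOException {

        File file = new File(SAMPLE_CSV_FILE_PATH);
        String str = FileUtils.readFileToString(file, "utf-8");
        str = str.trim();
        String str2 = str.replace("\n", ";").replace("\"", "").replace("\n\n", ";").replace("*www.*", "")
                .replace("\u0000", "").replace(",", ";").replace(" ", "").replaceAll(";{2,}", ";");

        List<String> lista1 = new ArrayList<>(Arrays.asList(str2.split(";")));

        // Matches every element that does not look like an e-mail address.
        Predicate<String> predicate = s -> !(s.contains("@"));

        Set<String> removeDuplicates = new TreeSet<>(lista1);

        removeDuplicates.removeIf(predicate);

        String fileName2 = "/Downloads/allMails.txt";
        try (BufferedWriter bw = new BufferedWriter(new FileWriter(fileName2))) {
            for (String line : removeDuplicates) {
                bw.write(line + "\n");
            }
            bw.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}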





It is impossible for a Set<String> to contain duplicate Strings. Whatever you are calling duplicates must still be different elements (maybe they have trailing spaces or non-printable characters).
– OH GOD SPIDERS
Jul 2 at 13:23
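
One way to check this is to print each element as a list of code points, so that characters an editor does not display become visible. The following is only a minimal sketch: the class name, the helper toCodePoints and the sample addresses are made up for illustration, and the trailing carriage return is an assumed example of a hidden character, not a confirmed diagnosis of the uploaded file.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class HiddenCharacters {

    // Renders every character as U+XXXX so that invisible ones show up.
    static String toCodePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Hypothetical data: the second entry carries a trailing carriage return.
        Set<String> mails = new TreeSet<>();
        mails.add("a@example.com");
        mails.add("a@example.com\r");

        // Prints 2: the set treats the two strings as different elements.
        System.out.println(mails.size());
        for (String mail : mails) {
            System.out.println(toCodePoints(mail));
        }
    }
}
```

If two lines of that output differ only in a code point such as U+000D or U+FEFF, the "duplicates" are in fact different strings.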








.replaceAll(";;",";").replaceAll(";;;",";").replaceAll(";;;;",";").replaceAll(";;;;;",";"); doesn't look right. It should probably be replaceAll(";{2,}",";"), or be avoided altogether by using a proper CSV parser.
– Pshemo
Jul 2 at 13:33
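
To illustrate the difference, here is a small sketch that runs both variants on the same input. The class and method names are hypothetical; only the two replaceAll patterns are taken from the comment above.

```java
class CollapseSeparators {

    // The chain of fixed-length replacements quoted in the comment.
    static String chained(String s) {
        return s.replaceAll(";;", ";")
                .replaceAll(";;;", ";")
                .replaceAll(";;;;", ";")
                .replaceAll(";;;;;", ";");
    }

    // One pass: any run of two or more semicolons becomes a single one.
    static String quantified(String s) {
        return s.replaceAll(";{2,}", ";");
    }

    public static void main(String[] args) {
        String input = "a;;;;b";
        System.out.println(chained(input));    // a;;b
        System.out.println(quantified(input)); // a;b
    }
}
```

The chain leaves a double semicolon behind because the first step turns four semicolons into two, and none of the later patterns matches a run that short.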







Just a side note: don’t write bw.close () inside a try(BufferedWriter bw = …) statement. The whole purpose of the try-with-resources statement is that you are not required to do manual closing. Also, you are working with two files. Are you looking at the right one?
– Holger
Jul 2 at 13:45
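
A minimal sketch of the writing step without the manual close might look like this. It assumes a StringWriter as a stand-in for the FileWriter so that it runs without touching the disk; the class and method names are hypothetical.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Arrays;

class WriteLines {

    // Writes one line per element and returns everything that was written.
    static String joinLines(Iterable<String> lines) {
        StringWriter out = new StringWriter(); // stand-in for new FileWriter(fileName2)
        try (BufferedWriter bw = new BufferedWriter(out)) {
            for (String line : lines) {
                bw.write(line);
                bw.write('\n');
            }
            // No bw.close() here: leaving the try block flushes and closes bw.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(joinLines(Arrays.asList("a@example.com", "b@example.com")));
    }
}
```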








Can you also include some content from your file that we could use to actually reproduce your problem? Without it we can only guess, which isn't a very efficient way of solving problems.
– Pshemo
Jul 2 at 13:46






Can you explain the rationale behind your decision to use replace or replaceAll? I have the strong feeling that you don’t understand the difference between these two methods at all.
– Holger
Jul 2 at 13:54
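
For reference, String.replace treats its first argument as literal text, while String.replaceAll treats it as a regular expression. The sketch below shows the effect on made-up input; the class name, method names and sample strings are hypothetical, and the regex "www\\." is just one possible way to express the intent.

```java
class ReplaceVersusReplaceAll {

    // Literal replacement: only removes the exact text "*www.*", asterisks included.
    static String literal(String s) {
        return s.replace("*www.*", "");
    }

    // Regex replacement: removes every "www." (the dot is escaped).
    static String regex(String s) {
        return s.replaceAll("www\\.", "");
    }

    public static void main(String[] args) {
        String input = "www.example.com";
        System.out.println(literal(input)); // www.example.com (unchanged)
        System.out.println(regex(input));   // example.com

        // The same argument means different things to the two methods.
        System.out.println("a.b".replace(".", "-"));    // a-b
        System.out.println("a.b".replaceAll(".", "-")); // ---
    }
}
```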







1 Answer



Before calling str.replace, you can try str.trim() to remove leading and trailing spaces and other unwanted, invisible characters.


str = str.trim()
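
Note that trim() only removes characters up to U+0020 at the very start and end of the string, so anything inside the text, or a character such as U+FEFF, is left alone. Below is a rough sketch of cleaning each element separately instead. The class name, the helper clean and the sample data are hypothetical, and the byte order mark and carriage returns are assumed examples rather than something confirmed for the uploaded file.

```java
import java.util.Set;
import java.util.TreeSet;

class TrimLimits {

    // Drops control characters (\p{Cc}) and format characters (\p{Cf}, e.g. U+FEFF),
    // then trims whatever whitespace is left at the ends.
    static String clean(String token) {
        return token.replaceAll("[\\p{Cc}\\p{Cf}]", "").trim();
    }

    public static void main(String[] args) {
        // Hypothetical input: a leading byte order mark and Windows line endings.
        String raw = "\uFEFFa@example.com\r\na@example.com\r\n";

        Set<String> trimmedOnly = new TreeSet<>();
        Set<String> cleaned = new TreeSet<>();
        for (String token : raw.trim().split("\n")) {
            trimmedOnly.add(token);
            cleaned.add(clean(token));
        }

        System.out.println(trimmedOnly.size()); // 2
        System.out.println(cleaned.size());     // 1
    }
}
```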





Unfortunately, it did not help. I have added the file I am working with; maybe it will help.
– Łukasz Szumowski
Jul 2 at 14:27





