C:\_G\WWW\~ELISANET\INFO\tscmd097.html
<http://www.elisanet.fi/tsalmi/info/tscmd097.html>
Copyright © 2003- by Prof. Timo Salmi  
Last modified Mon 29-Oct-2018 16:08:52

 
Assorted NT/2000/XP/.. CMD.EXE Script Tricks
From the html version of the tscmd.zip 1cmdfaq.txt file
 

This page is edited from the 1cmdfaq.txt faq-file contained in my tscmd.zip command line interface (CLI) collection. That zipped file has much additional material, including a number of detached .cmd script files. It is recommended that you also get the zipped version as a companion.

Please see "The Description and the Index page" for the conditions of usage and other such information.



97} I need to remove duplicate entries from the output or a file.

One option, adapted from Eric Pement's UNIX-flavored SED collection, removes consecutive duplicate lines:
  sed "$!N; /^\(.*\)\n\1$/!P; D"

To test it, let's have
  @echo off & setlocal enableextensions
  ::
  :: Make a test file
  set myfile_=MyFile.txt
  for %%f in ("%myfile_%") do if exist %%f del %%f
  for %%i in (1 2 2 2) do >>"%myfile_%" echo This is line %%i
  >>"%myfile_%" echo.
  for %%i in (6 6 8) do >>"%myfile_%" echo This is line %%i
  ::
  <"%myfile_%" sed "$!N; /^\(.*\)\n\1$/!P; D"
  ::
  :: Clean up
  for %%f in ("%myfile_%") do if exist %%f del %%f
  endlocal & goto :EOF

The contents of MyFile.txt will be
  This is line 1
  This is line 2
  This is line 2
  This is line 2

  This is line 6
  This is line 6
  This is line 8
and the output will be
  C:\_D\TEST>cmdfaq
  This is line 1
  This is line 2

  This is line 6
  This is line 8
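
The sed one-liner above can be read command by command. The following is only an annotated sketch. It assumes that a sed port is on the path and that MyFile.txt from the test above still exists in the current folder.
  @echo off & setlocal enableextensions
  ::
  :: $!N               If this is not the last line, append the next line
  ::                   to the pattern space, separated by a newline
  :: /^\(.*\)\n\1$/!P  If the two lines are not identical, print the first
  :: D                 Delete the first line and start the next cycle
  ::                   with whatever remains in the pattern space
  <"MyFile.txt" sed "$!N; /^\(.*\)\n\1$/!P; D"
  endlocal & goto :EOF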

If one uses third-party utilities and is prepared to go beyond awk/sed, a UNIX port of uniq.exe is a second option.
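
As a minimal sketch of the uniq route, assuming that some UNIX port of uniq.exe is on the path (which port is used is left open here) and that MyFile.txt is the test file made earlier:
  @echo off & setlocal enableextensions
  set myfile_=MyFile.txt
  ::
  :: uniq reads standard input and drops repeated adjacent lines
  <"%myfile_%" uniq
  ::
  :: Non-adjacent duplicates go too if the input is sorted first
  sort "%myfile_%" | uniq
  endlocal & goto :EOF
The second form changes the order of the lines, so it only suits lists where the original order does not matter.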

A third option is a Visual Basic Script (VBScript) aided solution:
  @echo off & setlocal enableextensions
  ::
  :: Make a test file
  set myfile_=MyFile.txt
  for %%f in ("%myfile_%") do if exist %%f del %%f
  for %%i in (1 2 2 2) do >>"%myfile_%" echo This is line %%i
  >>"%myfile_%" echo.
  for %%i in (6 6 8) do >>"%myfile_%" echo This is line %%i
  ::
  :: Build a Visual Basic Script
  set skip=
  set vbs_=%temp%\tmp$$$.vbs
  findstr "'%skip%VBS" "%~f0" > "%vbs_%"
  ::
  :: Run the script with Microsoft Windows Script Host Version 5.6
  <"%myfile_%" cscript //nologo "%vbs_%"
  ::
  :: Clean up
  for %%f in ("%vbs_%" "%myfile_%") do if exist %%f del %%f
  endlocal & goto :EOF
  '
  '.............................................
  'The Visual Basic Script
  '
  prev = "" 'VBS
  first = true 'VBS
  Do While Not WScript.StdIn.AtEndOfStream 'VBS
    str = WScript.StdIn.ReadLine 'VBS
    If (str <> prev) Or first Then 'VBS
      WScript.StdOut.WriteLine str 'VBS
    End If 'VBS
    prev = str 'VBS
    first = false 'VBS
  Loop 'VBS

The input and output will be as in the sed solution earlier.

If(!) you are prepared to accept that empty lines are omitted, and to ignore exclamation marks and other special-character problems, then a pure script will do:
  @echo off & setlocal enableextensions enabledelayedexpansion
  ::
  :: Make a test file
  set myfile_=MyFile.txt
  for %%f in ("%myfile_%") do if exist %%f del %%f
  for %%i in (1 2 2 2) do >>"%myfile_%" echo This is line %%i
  >>"%myfile_%" echo.
  for %%i in (6 6 8) do >>"%myfile_%" echo This is line %%i
  ::
  :: Process
  set prev=
  for /f "delims=" %%a in ('type "%myfile_%"') do (
    set str=%%a
    if not [!str!]==[!prev!] echo %%a
    set prev=!str!
    )
  ::
  :: Clean up
  for %%f in ("%myfile_%") do if exist %%f del %%f
  endlocal & goto :EOF

The output will be (note the dropping of the empty line)
  C:\_D\TEST>cmdfaq
  This is line 1
  This is line 2
  This is line 6
  This is line 8

Can this task be solved with a pure cmd script so that the empty lines are not omitted? Yes, but the solution is a bit kludgy and complicated. And the issue of the poison characters remains.
  @echo off & setlocal enableextensions enabledelayedexpansion
  ::
  :: Make a test file
  set myfile_=MyFile.txt
  for %%f in ("%myfile_%") do if exist %%f del %%f
  for %%i in (1 2 2 2) do >>"%myfile_%" echo This is line %%i
  >>"%myfile_%" echo.
  for %%i in (6 6 8) do >>"%myfile_%" echo This is line %%i
  ::
  :: Process
  :: findstr /n prefixes each line with "N:", so empty lines survive for /f
  set prev=
  for /f "delims=" %%a in (
    'findstr /n /v /c:"SomeUnlikelyString" "%myfile_%"') do (
      set str=%%a
      call :WriteOneLine "!str!" "!prev!"
      set prev=!str!
      )
  ::
  :: Clean up
  for %%f in ("%myfile_%") do if exist %%f del %%f
  endlocal & goto :EOF
  ::
  :: =============================================
  :WriteOneLine
  setlocal
  set str=%~1
  set prev=%~2
  :: Strip everything up to and including the first colon (the "N:" prefix)
  set str=!str:*:=!
  set prev=!prev:*:=!
  if not [!str!]==[!prev!] echo.!str!
  endlocal & goto :EOF

The output will be
  C:\_D\TEST>cmdfaq
  This is line 1
  This is line 2

  This is line 6
  This is line 8

There is, however, a concise pure script solution which drops the duplicate lines irrespective of their location in the original file. It also drops all the empty lines. Recall that the original test file is
  This is line 1
  This is line 2
  This is line 2
  This is line 2

  This is line 6
  This is line 6
  This is line 8

  @echo off & setlocal enableextensions
  ::
  :: Make the test file
  set oldfile_=C:\_M\MyOldFile.txt
  set newfile_=C:\_M\MyNewFile.txt
  for %%f in ("%oldfile_%") do if exist %%f del %%f
  for %%i in (1 2 2 2) do >>"%oldfile_%" echo This is line %%i
  >>"%oldfile_%" echo.
  for %%i in (6 6 8) do >>"%oldfile_%" echo This is line %%i
  ::
  :: Start from scratch
  for %%f in ("%newfile_%") do if exist %%f del %%f
  ::
  :: Pick unique lines from the original file, i.e. our test file
  for /f "tokens=* delims=" %%a in ('type "%oldfile_%"') do (
    find /i "%%a" "%newfile_%">nul
    if errorlevel 1 echo %%a>>"%newfile_%"
    )
  ::
  :: Display the result
  type "%newfile_%"
  endlocal & goto :EOF

The output will be
  C:\_D\TEST>cmdfaq
  File not found - C:\_M\MYNEWFILE.TXT
  This is line 1
  This is line 2
  This is line 6
  This is line 8
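
Note the use of if errorlevel 1 instead of if !errorlevel! GTR 0, which avoids the need for enabledelayedexpansion. The "File not found" message appears because the new file does not yet exist when find is called for the first time. Also be aware that find /i makes a case-insensitive substring match, so a line that is contained within an already stored line is dropped as well.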
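
As a sketch of a stricter variant of the script above, findstr with the /x /l /c: switches could replace find /i so that only whole-line, case-sensitive matches count as duplicates. Creating the result file first avoids the "File not found" message. The %temp% file names are assumptions made for this sketch, and the special-character caveats of the earlier scripts still apply, so treat it as a starting point rather than a vetted solution.
  @echo off & setlocal enableextensions
  ::
  :: Hypothetical file names, chosen for this sketch only
  set oldfile_=%temp%\MyOldFile.txt
  set newfile_=%temp%\MyNewFile.txt
  ::
  :: Make the test file
  for %%f in ("%oldfile_%") do if exist %%f del %%f
  for %%i in (1 2 2 2) do >>"%oldfile_%" echo This is line %%i
  >>"%oldfile_%" echo.
  for %%i in (6 6 8) do >>"%oldfile_%" echo This is line %%i
  ::
  :: Start from an empty, but existing, result file
  type nul >"%newfile_%"
  ::
  :: /x = the whole line must match, /l = literal, not a regular expression
  for /f "tokens=* delims=" %%a in ('type "%oldfile_%"') do (
    findstr /x /l /c:"%%a" "%newfile_%" >nul
    if errorlevel 1 >>"%newfile_%" echo %%a
    )
  ::
  :: Display the result and clean up
  type "%newfile_%"
  for %%f in ("%oldfile_%" "%newfile_%") do if exist %%f del %%f
  endlocal & goto :EOF
The empty line is still dropped, because for /f skips empty lines.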

To put things in perspective: by quite a coincidence, the other day I needed to perform that very task myself, combining two lists of newsgroups and then removing the duplicates. I didn't have to think for one second about which route to take when the situation came up for real. I skipped all the nice and fancy scripts and chose the UNIX port uniq.exe without any hesitation.

Also see Item #162. It includes the reverse task of listing duplicate lines.
