

Grex Agora41 Item 69: The Windows > DOS software item
Entered by keesan on Mon Apr 8 14:13:21 UTC 2002:

Questions about how to use alternative (DOS) software to deal with MS files.

269 responses total.



#1 of 269 by keesan on Mon Apr 8 14:15:57 2002:

Someone just sent me a zip file which contains three .rtf files, two of which
start with the same 8 characters.  pkunzip, when it reached the second file,
said a file by that name already exists and gave me a choice of overwriting
the file or not.  I said no, and got out only two not three files.  Is there
some way to use pkunzip or some other DOS unzip program to get out all three
files?  And is there a DOS rtf reader that will handle 'Windows rtf' files?
The one I have (VIEW) does not seem to work, just gave me all the rtf tags.

The simplest solution was to ask for them to convert the rtf to ascii and give
8-character distinct file names.


#2 of 269 by keesan on Mon Apr 8 15:09:42 2002:

I converted the rtf to ascii and here is what I got.  Is this some cousin of
html?  Why would it be sent to a translator?  I omitted company name.

C:\DOWNLOAD\ANCILLAR.RTF


 
 Normal;tw4winMark;tw4winError;tw4winPopup;tw4winJump;tw4winExternal;
 tw4winInternal;tw4winTerm;DO_NOT_TRANSLATE;

<stf "F3.01">

<sourcecharset "Normal">

<sourcelanguage "English (US)">

<sourcehyphenation "Turn Off">

Solution_278004r2¨Phase1_convertions_for_translation¨mif¨278004-002-body.mif">

<bb "Cross-Reference Formats">

<bb "Variable"><:v "<:bb "paratext[Chapter Title]">" 1>

<vfmt "Running H/F 3"><:bb "paratext[Chapter Title]">

<bb "Variable"><:v "<:bb "curpagenum">" 2>

<vfmt "Current Page #"><:bb "curpagenum">

<bb "Variable"><:v "User's Guide" 3>

<vfmt "Manual Title">{>User's Guide<}{User's Guide<0}

(there was lots more - with all the actual English words doubled)


#3 of 269 by aruba on Mon Apr 8 15:36:48 2002:

RTF is a file format that stores typesetting information as well as text,
though the file itself is text.  If you want to send someone a Word
document, you can save it as RTF and send that without losing the
formatting.  It's easier to transmit, since it's text and smaller than the
original word doc, and portable to other applications (and among different
versions of Word).  It also can't contain viruses the way a Word doc can.


#4 of 269 by keesan on Mon Apr 8 16:13:12 2002:

That is interesting to know about the viruses and about RTF being portable
between versions of WORD.  I used a program called VIEW to convert the RTF
to ascii.  All those tags are TRADOS tags.  The person who sent the zip file
said she would send me an rtf file.  I have no idea what that will be.
What is TRADOS?  Why are all the English phrases in it doubled?  They
apparently converted the TRADOS to rtf.  This is all from a Czech translation
agency which for some reason is translating a computer manual from English
to Macedonian and found my website.  


#5 of 269 by keesan on Mon Apr 8 16:18:58 2002:

TRADOS is software for translations and localization.  The Czechs are doing
software localization - anyone have any experience in that?  We had a French
friend who was doing it and had to make the French short enough to fit into
the space occupied by the English.  His primary complaint was that the software
was badly written even in English.


#6 of 269 by krj on Mon Apr 8 17:04:20 2002:

(RTF stands for Rich Text Format, IIRC.)


#7 of 269 by rlejeune on Mon Apr 8 20:32:44 2002:

I thought it was an abbreviation of RTFM. 


#8 of 269 by keesan on Tue Apr 9 00:34:34 2002:

Still no new file.  I cannot even imagine what format the Macedonian might
arrive in.  We may end up going the scan-print-pdf route again.  I can convert
pdf to pbm or ps now and read it, at least.  Perhaps they can export to html?


#9 of 269 by gelinas on Tue Apr 9 03:16:16 2002:

Try moving the first file, giving it a new name, then let pkunzip overwrite
when it gets to the second file.  Thus you would end up with

        file1new                # the first file with a new name
        file1                   # the second file with the first name
        file3                   # the third file
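
For anyone who has Python on a machine upstream of the DOS box, the rename-then-extract idea can be automated.  This is only a rough sketch, not how pkunzip works: the short_name helper, the "~1" tie-breaker and the sample file names are all made up for illustration, and it keeps the extracted data in memory rather than writing files.

```python
import io
import zipfile


def short_name(name, taken):
    """Return an upper-case 8.3 name for `name` that is not already in `taken`."""
    stem, dot, ext = name.rpartition(".")
    if not dot:                          # no extension at all
        stem, ext = name, ""
    stem = "".join(c for c in stem if c.isalnum() or c in "-_").upper() or "FILE"
    ext = "".join(c for c in ext if c.isalnum()).upper()[:3]

    def full(base):
        return base + "." + ext if ext else base

    candidate = stem[:8]
    n = 1
    while full(candidate) in taken:
        suffix = "~" + str(n)            # hypothetical tie-breaker, e.g. MANUAL~1
        candidate = stem[:8 - len(suffix)] + suffix
        n += 1
    taken.add(full(candidate))
    return full(candidate)


def extract_with_short_names(zip_bytes):
    """Map every archive member to a unique 8.3 name; return {short_name: data}."""
    out, taken = {}, set()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            base = info.filename.split("/")[-1]
            out[short_name(base, taken)] = zf.read(info)
    return out
```

With members named manual01-body.rtf and manual01-front.rtf (invented names), the second one comes out as MANUAL~1.RTF instead of colliding with MANUAL01.RTF.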


#10 of 269 by keesan on Tue Apr 9 13:21:25 2002:

How do I use pkunzip to unzip files one at a time rather than all at once?
This is purely theoretical by now. They finally realized that I am not a
native speaker of Macedonian, which I have been telling them all along, when
I pointed out that I cannot correct grammar errors in their Macedonian.
Somehow I was not expecting this to turn into paid employment.
I think somewhere I may have a .doc file for pkunzip.....


#11 of 269 by tsty on Tue Apr 9 13:51:04 2002:

pkzip/unzip comes with its own help file ...


#12 of 269 by keesan on Tue Apr 9 13:52:46 2002:

You mean pkunzip /h or /?  I will try that.  Thanks.  Curious what form the
Macedonian is in. One time someone sent me Cyrillic that consisted solely of
?? ? ????? ??. They saved it as text but not as Cyrillic.
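
The row of question marks is what typically happens when text is saved in a character set that has no Cyrillic letters.  A small sketch of the effect, assuming the sender's program substitutes "?" for anything it cannot encode (the sample phrase and the save_as_bytes name are just for illustration):

```python
def save_as_bytes(text, charset):
    """Encode text the way a 'save as plain text' step might, replacing
    any character the target charset lacks with a question mark."""
    return text.encode(charset, errors="replace")


sample = "Добар ден"                         # illustrative Cyrillic phrase

lossy = save_as_bytes(sample, "ascii")       # Cyrillic letters are lost
kept = save_as_bytes(sample, "cp1251")       # Windows Cyrillic code page keeps them
```

Once the letters have been replaced there is no way to get them back; the file has to be saved again in a Cyrillic code page at the sender's end.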


#13 of 269 by keesan on Tue Apr 9 15:15:29 2002:

I think the idea with pkunzip was to unzip all the files, tell it not to
overwrite, rename the first file, and unzip again letting it overwrite.
But they figured out how to name all three files differently before zipping
them this time.  They sent me a zip file containing three files in Macedonian
- Macedon.zip (8 characters).  This file unzipped to "mac- bod.rtf" and
"mac- fro.rtf" and ancilar.rtf (they apparently shortened the long file names by
removing a number that was identical in both files, but did not shorten the
resulting names to 8 characters and body was cut off to bod).  The first two
files have spaces in them, which DOS is not happy with.  I could not view
either of them with my DOS-based VIEW program (file does not exist or is zero
length) or with Newdeal (which uses a menu system and found the files but
could not open them).  I could not rename the files with DOS rename.  I could
look at them with LIST by typing LIST *.rtf.  I could view the ancilar.rtf
file with Newdeal.  Newdeal does not handle Cyrillic and instead told me
Language 1024 and then /'e08/ etc. Lots of strings of coded characters.  Where
can I find a list of how the codes correspond to ASCII numbers, assuming that
they do?
        I had asked these people to convert their files to ascii or at least
to WORD so I could read them.  I can read WORD 6 for Win31.  Or they could
have printed everything out on a page or two and emailed me a scanned file.
Or converted to html if they knew how to do that.  Apparently they don't know
how to remove TRADOS tags - they said they removed some of them.  The TRADOS
version refers to PDF a few times.  I could have dealt with a PDF file but
not a TRADOS file of Cyrillic in Windows rtf format.
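
On the coded characters: those strings appear to be RTF hex escapes, written as a backslash, an apostrophe and two hex digits.  Each one stands for a single byte in the document's code page, not an ASCII number.  The sketch below assumes the code page is Windows-1251 (the usual Windows Cyrillic one); the real value is declared by the \ansicpg control word in the RTF header, and the function name is made up.

```python
import re

# One or more consecutive \'hh escapes, e.g. \'e4\'e0
HEX_RUN = re.compile(r"(?:\\'[0-9a-fA-F]{2})+")


def decode_rtf_hex(rtf_text, codepage="cp1251"):
    """Replace runs of \\'hh escapes with the characters they encode.

    codepage is an assumption; check \\ansicpg in the file's header.
    All other RTF control words are left untouched.
    """
    def repl(match):
        hex_digits = match.group(0).replace("\\'", "")
        return bytes.fromhex(hex_digits).decode(codepage, errors="replace")

    return HEX_RUN.sub(repl, rtf_text)
```

This only decodes the escaped letters.  Stripping the remaining RTF tags is a separate job.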


#14 of 269 by gull on Tue Apr 9 17:07:30 2002:

I think you can give pkunzip a list of files to extract from an 
archive, but it's been so long since I used it on the command line I 
can't remember.  These days I tend to use WinZip or Aladdin Expander.


#15 of 269 by aruba on Tue Apr 9 21:27:16 2002:

You could use Norton Utilities to go in and change the names of those files
to not have spaces.

Word 6 for Win31 should be able to read rtf files, I think.


#16 of 269 by keesan on Wed Apr 10 14:37:20 2002:

Thanks, I will try Word 6 on them next time, if there is a next time.
I wish people would learn to use Windows enough to be able to convert their
files to ASCII.  Lots of Windows users don't even know how to rename files.

Next problem, which Mark has been trying to help us with.  Tim Ryan gave us
a partly-working Canon Multipass CMP3500 printer which uses a 25M zipped
(self-executing) driver that we cannot manage to fit into our computer.  Mark
suggested downloading one for a previous model (3000) which we are about to
try.  But since we are not interested in doing anything except simple
printing, would a driver from another Canon BJ printer work as well?  Win95
comes with lots of those already (BJ4000 takes the same cartridge, I think).
The C3000.exe file is 8M - I cannot imagine how they got the CMP3500.exe file
up to 25M when the two models appear to do the same things (fax, copy, print).
At one point the computer said it needed a bidirectional cable - is that the
same as IEEE shielded cable or would an ordinary printer cable work?  We can
get the IEEE for $5 from MCM Electronics.  Our friend offered to lend us the
one he bought for only $25 retail.

I have a non-windows program that claims to print with BJ4000 driver but it
told me the port was already in use - is that a cable problem?  The port was
in use only because we had it hooked up to the Canon printer.

A friend's Canon BJ printer will print from DOS (printscreen, text editor)
but this one will not.


#17 of 269 by blaise on Wed Apr 10 18:04:04 2002:

You could also use DOS rename -- "rename mac-?bod.rtf mac-body.rtf"


#18 of 269 by gull on Wed Apr 10 19:11:25 2002:

> Thanks, I will try Word 6 on them next time, if there is a next time.
> I wish people would learn to use Windows enough to be able to convert
> their files to ASCII.

Translation:  I want other people to learn Windows so I don't have 
to. ;)


#19 of 269 by keesan on Wed Apr 10 20:04:27 2002:

I would be delighted if nobody used Windows, but if they insist on it they
should learn how to rename their files and convert them to something non-MS.
I will try using rename that way - did not know you could do it batch.

We are about to try two borrowed printer cables and the M3000 driver.
We can order inkjet ink for $30/pint, which is 12 refills for a 40ml cartridge
(or 24 for a 20 ml, or about 100 for a 4 ml).  Useful when the fax paper runs
out but people keep giving us more of it.


#20 of 269 by blaise on Thu Apr 11 14:00:00 2002:

While you can use rename batch, the particular example I gave was intended
to match just one file (which is why the replacement doesn't have any
wildcards).  An example of using rename in a batch mode would be "rename *.lst
*.txt" which would rename any files with an extension of lst to have an
extension of txt instead.


#21 of 269 by keesan on Thu Apr 11 14:50:19 2002:

How would the rename program deal with file 1.txt and file21.txt if I told
it rename file?1.txt file1.txt?  Would it not have to rename two different
files to the same name?


#22 of 269 by jep on Thu Apr 11 16:09:33 2002:

re #21: It will return "Duplicate file name or file in use" if there is 
a conflict for file names.  You would be attempting to give both files 
the same name, which would be a conflict.  

It's not too easy to make file names containing spaces in MS-DOS or the 
Windows command prompt.  I found I could do it, from the Windows 98 
command prompt anyway, by enclosing that part of the file name in 
quotes, for example:

   rename abc.txt "a c".txt

You could use this in reverse, too, to rename one file.  For example:

   rename "file 1".txt 1file.txt
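
The conflict in #21 can be checked before touching any files.  This is a simplified model and not DOS itself: it uses Python's fnmatch, where "?" matches exactly one character, and the plan_rename name is invented for the example.

```python
from fnmatch import fnmatchcase


def plan_rename(names, pattern, target):
    """Model 'rename PATTERN TARGET' where TARGET has no wildcards.

    Returns (renamed, refused): the first matching file gets the target
    name, and later matches are refused because that name is now taken.
    """
    existing = {n.lower() for n in names}
    renamed, refused = {}, []
    for name in names:
        if not fnmatchcase(name.lower(), pattern.lower()):
            continue
        if target.lower() in existing:
            refused.append(name)             # would be a duplicate file name
        else:
            renamed[name] = target
            existing.discard(name.lower())
            existing.add(target.lower())
    return renamed, refused


renamed, refused = plan_rename(
    ["file 1.txt", "file21.txt", "other.txt"], "file?1.txt", "file1.txt"
)
```

Here "file 1.txt" gets the new name and "file21.txt" is refused, which is the duplicate-name situation described above.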


#23 of 269 by keesan on Thu Apr 11 19:16:50 2002:

I just tried the first of these with MS DOS 6.22.
Parameter format not correct - "a
Maybe Windows DOS's can handle this but mine cannot.

Hans in Sweden informed me that 'port already in use' means that the printer
will work only with Windows.  Somewhere there is a driver only 3M long that
should work with the model 3000 but the link to it at driverguide is broken.
We will try the 8M model on the 3000 and maybe also on the 3500.
The 3000 claims to also work with Win31 but I could not find a driver.
How would one go about making a DOS driver for these printers?


#24 of 269 by aruba on Thu Apr 11 20:12:46 2002:

What do you mean by a DOS driver, Sindi?  The idea of having drivers for
printers built into the operating system is a difference between DOS and
Windows.


#25 of 269 by keesan on Thu Apr 11 20:17:42 2002:

A driver for a non-windows program, for instance for WP or Newdeal or printgf
or gifprint or ghostscript or the other programs I have been using to print
text or graphics with.  They all have Epson 9-pin and 24-pin and IBM 24-pin
generic drivers that work with most dot-matrix printers.  Is there some
generic print driver that might work with a bubble-jet which will not even
Print Screen?  Newdeal has drivers for BJC-4000 and BJC-4200, and the 4200
and MPC3000 are supposed to work with the same Win95 driver but when I try
using ND it tells me 'port already in use'.


#26 of 269 by aruba on Thu Apr 11 20:20:04 2002:

Well, you should ask the manufacturers of those programs.


#27 of 269 by keesan on Thu Apr 11 20:22:24 2002:

I just wrote 'tech support' at Newdeal, which seems to be defunct but the
support person is still supporting it.  I don't think WP is supported any
longer but I will check user groups.


#28 of 269 by keesan on Thu Apr 11 20:34:14 2002:

EUREKA!  This was probably written by the guy who recruited me for beta
testing.  He is amazingly good at his job.

                                   NewDeal Technical Support Document 272

                       Specific Printer Notes, Canon
     _________________________________________________________________

  All Canon BubbleJets [long section on other models omitted here]

   QUESTION: Why does my Canon printer spit out a blank page between each
   printed page?

   ANSWER: Canon forces the printing to start either 1/4 or 1/2 inch from
   the top of the page, depending on the printer type. This causes the
   printable area on the page to be that much less (10 3/4 or 10 1/2 on
   an 11 inch page). The software thinks the paper is 11 inches long and
   keeps track, down to the micro inch, where it is on the paper. When it
   reaches the 10 1/2 inches, your printer calls the page done and ejects
   it. However, the software thinks there is still another 1/2 inch to
   print (usually the footer area). Now when you insert a new sheet of
   paper, the software finishes printing the rest of the page (normally a
   blank footer) and form feeds to the next sheet.

   To correct what the software thinks the paper size is, you need to run
   the Preferences application and change the default settings. Select
   the Printer button. In the next window, select the 'Change
   Defaults....' button. Change both the Document size and the Paper size
   to a height of 10.875 and click OK. Try a new document to see if the
   paper size is correct. You may need to go back and change the height a
   few times until it is correct (the height changes in .125 increments).
   Usually either 10.875 (for 1/4 inch forced top margin printers) or
   10.75 ( for 1/2 inch forced top margin printers) will work.

  Canon MultiPASS and MultiPASS C5000

   To use the MultiPASS with NewDeal, you will need to select an
   emulation printer driver from those listed within NewDeal. Close the
   MultiPASS Background. On the MultiPASS control panel:
    1. Press FUNCTION
    2. Press 0 on the numeric keypad
    3. Press the right search arrow '>' until #4 DOS Printing appears
    4. Press START/COPY, ON appears
    5. Press START/COPY again

   The display will read 'Printer Mode.' You are now ready to print from
   a DOS application.

   The MultiPASS can be used with the following print drivers in Windows
   and in DOS programs:
     * BJC 4300, 4200, 4100, 4000, 70, 600, 600e, 800 (color)
     * BJ 200, 200e, 200ex, 230, 20, 10ex
     * Epson LQ 2550 & 2500 (color)
     * Epson LQ 510,850, 500

   Note: The MultiPASS was designed to print using Windows applications.
   Canon does not guarantee print results from DOS.

   Last Modified 26 Jun 1999



#29 of 269 by keesan on Fri Apr 12 00:52:13 2002:

I just found a new way to remove spaces from files.  If they are sent to my
ISP webmail as an attachment and I download them, the name
20411-4 NYS Ed NC82.doc (three spaces) downloads and saves to my
home directory as 20411 (Unix apparently ends things just before a -).
Comments?
I suggested to someone who was going to fax me a document for translation that
he scan it and send it as a BW gif or even a BW pdf.  It came as WORD.
Now I get to test ANTIWORD, the latest weapon against MS, which claims to read
the most recent WORD even in Cyrillic and other character sets.  


#30 of 269 by keesan on Fri Apr 12 01:29:52 2002:

View hidden response.



#31 of 269 by mary on Fri Apr 12 01:35:58 2002:

There is something odd about that last post.  From a distance it 
looks like baby Jesus.  But the voice-over sounds more like
Willie Nelson.


#32 of 269 by remmers on Fri Apr 12 01:57:30 2002:

Looks like the Shroud of Turin to me.

(Suggestion for future reference:  Don't post binary files in
responses.  They can mess up a person's terminal settings.
Better to save the file in your home directory and tell
people what its name is.)


#33 of 269 by keesan on Fri Apr 12 02:12:56 2002:

Okay, take a look at 20411-4.doc    When I was downloading it seemed to imply
that this is WORD97 and I told one of my programs to view it as that.
I see (here, not when using pico) <84> <97> etc.  Perhaps these are codes for
Cyrillic characters?  I took a look at this with Jim's text editor and we
found the name of the company and the strings .tif and .png so I tried
renaming it to those and the view program could not recognize the format. 
This is an image viewer that does 40 formats of images. 
Every time I master a new format someone invents another one.  I can handle
gif and even print it, pdf (convert to pbm or ps and view and print those),
rtf (without Cyrillic, anyway), even various WORDs, but now they send TRADOS
and whatever this one might be.  I expect a fax tomorrow, I hope.
Would anyone with WORD97 like to take a look at the complete text of this one
and tell me why no programs can tell the number of lines in it?  PICO told
me 'long line length'.  I chopped off the beginning for anonymity but I can
forward you the whole thing.


#34 of 269 by keesan on Fri Apr 12 02:44:20 2002:

A few of the programs that we looked at this with identify it as WORD 97
or WORD 8 (same thing) and it came labelled .doc.  I sent a copy to a friend
with WORD97.  There are 299 lines each about three times normal length, which
is probably why my programs cannot identify the number of lines as they do
not recognize lines over a certain length.  Jim says some of the lines may
be extremely long and some very short.  They were full of nulls.
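
A quick way to see why a viewer gives up on a file like this is to count the lines, the longest line and the NUL bytes directly.  This is a minimal sketch and the line_stats name is hypothetical:

```python
def line_stats(data):
    """Summarise a byte string: line count, longest line, number of NUL bytes."""
    lines = data.splitlines()
    return {
        "lines": len(lines),
        "longest": max((len(line) for line in lines), default=0),
        "nuls": data.count(b"\x00"),
    }
```

To use it on a real file, read the file in binary mode (open(path, "rb").read()) and pass the result in.  A large NUL count is a strong hint that the file is binary rather than text.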


#35 of 269 by mdw on Fri Apr 12 03:45:00 2002:

I'm not sure what "20411-4.doc" is.  It looks vaguely like a word
document with all the NULs stripped out and perhaps otherwise mangled.
Word documents are stored in what MicroSoft calls "structured storage"
and the Linux world calls "OLE" files; it's a binary file structure
similar in some ways to a floppy disk image.  Removing the NULs and
otherwise randomly mangling it will destroy the data structures and
leave largely useless remains.

(Needless to say, pasting it in as a response will only further
mangle whatever binary data was left...)

"20411-4" (no .doc) on the other hand, *does* appear to be a valid
MicroSoft word document, at least after a fashion.  The "structured
storage" data structures are completely intact, so at that level at
least, it's completely intact.  I've forgotten at which offset nFIB is
stored, and I never completely understood which nFib corresponded to
what version of Word.  As best I can determine, the actual data stream
of the word document consists of just this: ^A^M which likely means
there's one special object and the obligatory paragraph ending.
I think all MicroSoft products that use "structured storage" always
arrange to write a special document summary stream, that contains
a few interesting strings, including:
        Copper Translation Service
        _PID_GUID
        _PID_HLINKS
        {5816E8C0-39C3-11D5-96A5-00107A185D3C}
and in unicode:
        E:\Current Copper Work\Untitled.tif
I'm guessing that the bracketed number is some sort of UUID, and may
perhaps be the registration number of the copy of Word that made this
file.  Maybe.  As a companion to to the WordDocument stream, there's the
"Data" stream, which contains the objects of the document.  I don't see
anything that looks like TIFF, but there is something that looks like a
valid PNG header at offset +0x133.  In fact, it appears to be a 1482 x
1287 1 bit non-interlaced image; you can get it here:
        http://www-personal.umich.edu/~mdw/20411-4.png
The first line appears to read:
        l. Srpskohrvatski jezik i knjizevnost           2 - -
There is a hat over the last "z" on the line, and there are about 22
more mostly similar text lines.  Looks like a report card, maybe.

And, before you ask, no, I specifically disclaim any interest in dealing
with any further files like this.  If *you* want to be able to do this,
you should strongly consider installing Linux, there's something called
"libole" out in the Linux world that understands MicroSoft "structured"
storage, and MicroSoft has published documents describing the content and
fields of MicroSoft word documents, which are well distributed on the
web today.  I learned about this stuff mainly so that I could read the
formatting information in word documents; I have no actual idea how the
"Data" stream is actually formatted, and was actually quite surprised to
see PNG magic in the file.
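
Pulling out an embedded PNG like the one described above does not need any knowledge of the Word format: a PNG starts with a fixed 8-byte signature and ends with an IEND chunk.  The sketch below scans for the signature and walks the chunks.  It does not verify chunk checksums, it assumes the image is stored contiguously in the file, and the carve_png name is made up.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def carve_png(blob):
    """Find the first embedded PNG in blob.

    Returns (offset, png_bytes, info) or None.  info comes from the IHDR
    chunk: width, height, bit depth and whether the image is interlaced.
    """
    start = blob.find(PNG_SIGNATURE)
    if start < 0:
        return None
    pos = start + len(PNG_SIGNATURE)
    info = None
    while pos + 8 <= len(blob):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC
        length, kind = struct.unpack(">I4s", blob[pos:pos + 8])
        body = blob[pos + 8:pos + 8 + length]
        if kind == b"IHDR" and len(body) >= 13:
            width, height, depth, _color, _comp, _filt, interlace = struct.unpack(
                ">IIBBBBB", body[:13]
            )
            info = {
                "width": width,
                "height": height,
                "bit_depth": depth,
                "interlaced": bool(interlace),
            }
        pos += 8 + length + 4
        if kind == b"IEND":
            if pos > len(blob):
                return None                  # file ends before the image does
            return start, blob[start:pos], info
    return None
```

Writing the returned bytes to a file with a .png extension should give something an image viewer can open, provided the image really is stored in one piece.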


#36 of 269 by aruba on Fri Apr 12 03:57:41 2002:

Sindi, could you please expurgate #30?  It made my terminal unusable when I
read it; I had to log off and log back on again.


#37 of 269 by mcnally on Fri Apr 12 10:40:17 2002:

  It did nasty things to my xterm as well.


#38 of 269 by keesan on Fri Apr 12 15:06:40 2002:

How do I expurgate 30?  (The full command and which prompt to type it at
please).   I did not post the whole file or the part with the company name
to save embarrassing them in public but I suppose they will never know.
We saw something in there like untitled1.tif and at the end of the same line
also PNG but renaming it to PNG did not make something viewable.
The line you typed out for me means 'Serbocroatian language and literature'
so I suspect it is someone's diploma.   Marcus, I am astonished and grateful
that you were able to do this transformation and I hope never to come up with
this format again as the company should know better.  I told them to send a
gif or a pdf file.  A png would have worked.  A WORD png is something else.
They may have converted tif to png to WORD?
        I have notified NEWDEAL that their WORD filter does not work on these
things, but then again my friend with WORD97 also could not read anything in
it.  He tried various programs one of which said 'mangled email attachment'.
The author of VIEW 1.70 says it is a graphic and his program only converts
text.  So does ANTIWORD.
        I am off to download and maybe even translate Marcus' png.


#39 of 269 by keesan on Fri Apr 12 15:14:23 2002:

I tried, from the bbs prompt, expurgate 69 30 and scribble 69 30.  No luck.
png file downloaded, thanks Marcus.  It is 2/3 the size of the WORD (?) file.



- Backtalk version 1.3.30 - Copyright 1996-2006, Jan Wolter and Steve Weiss