Importing several delimited text files (by pipe: |). actual lengths of fields can vary. Several character strings contain tabs and carriage return type characters, which is throwing out the INFILE. The records themselves appear on a single row in the file. Character strings are not surrounded by double quotes. Affected data is free form text fields, from copy-paste (emails and similar). Proc import was originally tried but even with guessingrows max it isn't processing the file correctly. a basic SAS dataset with INFILE DLM and the like also processes the file incorrectly. Eventually used SAS Enterprise Guide's inbuilt "Import" function. This does read the correct number of records. But also caused fields to be read in with odd positioning: field A becomes field B and so on. Any thoughts?
... View more
I have been looking at this code syntax for hours, the Teradata code runs in a different environment, but bringing it through Enterprise Guide is giving me issues. The error I am getting is: ERROR: Teradata execute: Syntax error: expected something between ')' and ';'. I can't seem to figure out what the issue is. I appreciate any help or insight on this.
PROC SQL NOERRORSTOP;
CONNECT TO TERADATA (AUTHDOMAIN=xx server="xxx" connection=global mode=teradata);
execute (drop table TestServer.Test3) BY Teradata;
execute (create table TestServer.Test3 as
(
select *
from TestServer.Test1
UNION ALL
select *
from TestServer.Test2
))BY Teradata;
quit;
... View more
The fact that I even have to post this tells u what kind of day I'm having 😞 If the HLT_ALZDEM variable doesn't have a valid value of 1 or 2 (yes/no), then the variables that are constructed based on HLT_ALZDEM should be given a value of missing. However, they sometimes are given a value of 0 instead.
Below is a portion of the output file, sorted for rows where HLT_ALZDEM is missing. Since missing is clearly not 1 or 2, all ADRD_group and non_ADRD_group values should also be missing. But they're not. Obs 9 and 10 are identical except for these 2 variables., when Obs 9 is missing and Obs 10 is 0 for both.
The only possible explanation I can think of is that there are non-printable characters for some values of HLT_ALZDEM. But even if there are, those non-printable characters are not 1 or 2. Frequencies for HLT_ALZDEM only show values of 1, 2, and missing.
IF cohort_flag = 1 AND ADM_MA_FLAG_YR = 3
THEN MA_group = 1; ELSE MA_group = 0;
IF cohort_flag = 0 THEN MA_group = .;
IF cohort_flag = 1 AND ADM_FFS_FLAG_YR = 3
THEN TM_group = 1; ELSE TM_group = 0;
IF cohort_flag = 0 THEN TM_group = .;
IF cohort_flag = 1 AND HLT_ALZDEM = 1
THEN ADRD_group = 1; ELSE ADRD_group = 0;
IF cohort_flag = 0 THEN ADRD_group = .;
IF HLT_ALZDEM ^in (1,2) THEN ADRD_group = .;
IF cohort_flag = 1 AND HLT_ALZDEM = 2
THEN non_ADRD_group = 1; ELSE non_ADRD_group = 0;
IF cohort_flag = 0 THEN non_ADRD_group = .;
IF HLT_ALZDEM ^in (1,2) THEN non_ADRD_group = .;
Obs
HLT_ALZDEM
cohort_flag
MA_group
TM_group
ADRD_group
non_ADRD_group
1
0
2
1
0
1
0
0
3
1
1
0
4
0
5
0
6
1
0
1
7
1
1
0
8
1
1
0
9
1
1
0
10
1
1
0
0
0
11
1
1
0
12
1
0
1
13
1
0
1
14
1
0
1
15
1
0
1
16
1
0
1
0
0
... View more
I need to combine same clinic names if the corresponding prior_span =1.
For example, if Clinic_name1 and Clinic_name2 where both Clinic A and Prior_span1 and prior_span2 both =1 then Clinic would = A.
Clinic_name_1 corresponds with Prior_span1
Clinic_name_2 corresponds with Prior_span2
Clinic_name_3 corresponds with Prior_span3
Below is my sample data.
Thanks!
Data Have;
input
ID$ Clinic_Name_1$ Clinic_Name_2$ Clinic_Name_3$ Prior_Span1$ Prior_Span2$ Prior_Span3$ Cost service$ ;
datalines;
1 B A . 0 0 0 168 A
2 A A . 1 0 0 46 B
2 A A . 0 1 0 99 C
3 B . . 0 0 0 24 D
4 D . . 0 0 0 54 A
5 B D F 0 1 0 405 B
7 A . . 1 0 0 90 F
8 E E A 0 0 1 87 J
9 F . . 0 0 0 0 G
10 C C C 0 0 0 40 C
10 C C C 0 0 1 374 B
10 C C C 0 1 1 60 A
;
run;
***Summary table-clinic var not created yet****;
proc sql;
create table want as
select 1 as ID1,'service' as a length=40,Clinic,
1 as ID2,put(service, $40.) as b length=40,
1 as ID3,(select count(distinct ID) from have where Clinic=a.Clinic and service=a.service and cost>0 ) as c format=comma20.,
1 as ID4,calculated c/(select count(distinct ID) from have where Clinic=a.Clinic and cost>0 ) as d format=percent8.1,
1 as ID5,sum(cost) as e format=dollar20.,
1 as ID6,sum(cost)/(select sum(cost) from have where Clinic=a.Clinic) as f format=percent8.1
from have as a
group by Clinic,service
union all
select 7 as id1,'09'x as a,Clinic,
7 as id2,'Total Clients' as b,
7 as id3,count(distinct ID) as f,
7 as id4,.,
7 as id5,.,
7 as id7,.
from have
where service ne 'j' and cost>0
group by Clinic
union all
select 7 as id1,'09'x as a,Clinic,
7 as id2,' Total Cost' as f,
7 as id3,. as c,
7 as id4,.,
7 as id5,sum(cost),
7 as id7,.
from have
where service ne 'j' and cost>0
group by Clinic
;quit;
options missing=' ';
proc report data=want nowd;
column id1 a b Clinic,(c d e f);
define id1/group noprint;
define a/group noprint ' ';
define b/group ' ';
define c/analysis 'Clients';
define d/analysis '% of Clients';
define e/analysis 'Cost';
define f/analysis '% of Cost';
define Clinic/across '';
compute before a/style={just=l color=black};
if id1=1 then len=0;
else len=10;
x=' ';
line x $varying10. len;
line a $40.;
endcomp;
run;
... View more
Hi, tsmk is my time-dependent covariate, deathstatus_inhance is censor, and stime_inhance is time. The "Change step" worked well, but the "Count step" only worked for four observations and stopped. Please help me to correct my code. DATA analysis.change;
SET analysis.smk_stime2;
ARRAY tsmk_(*) tsmk_1-tsmk_5; *call in the time-varying smoking variables;
ARRAY chng(5); *the new indicator variables;
t=1; *initialize the position variable for the indicator variables;
DO i = 2 TO 5;
IF tsmk_(i) NE tsmk_(i-1) THEN DO; *detects whether there is a change in smoking status;
chng(t) = i-1; *assigns the last year the status remained constant;
t=t+1;
END;
END;
RUN;
DATA analysis.count;
SET analysis.change;
ARRAY tsmk_(*) tsmk_1-tsmk_5; /* call in the time-varying smoking variables */
ARRAY chng(*) chng1-chng5; /* call in the indicator variables */
start = 0; /* initialize the beginning time for the study */
censor2 = 0; /* initialize the new censor variable */
t = 1; /* initialize the position variable for the indicator variables (chng1-chng5) */
DO i=1 TO stime_inhance; /* makes sure we only output the records that smoking status remains constant */
IF (chng(t) > . and chng(t) < stime_inhance) or i = stime_inhance THEN do;
/* assign the value of smoking status */
IF chng(t) > . THEN smoking_status = tsmk_(chng(t));
ELSE smoking_status = tsmk_(stime_inhance); /* assign the end time */
stop = min(chng(t), stime_inhance); /* assign the value of the censor variable */
IF i = stime_inhance THEN censor2 = deathstatus_inhance; /* assign the new start time */
IF t > 1 THEN start = chng(t-1); /* move the position variable */
t = t + 1;
OUTPUT; /* output the record to the new dataset */
end;
END;
RUN;
... View more