Inconsistent results when supplying cellTypes as factor vs character to toSF() #37

@Liuy12

Description

Thanks for developing such a useful tool! I encountered some issues when analyzing my curio-seeker data.

  • I see significant differences in downstream results (makeShuffledCells / findTrends) depending on whether I pass a factor or a character vector to toSF().
  • Expected: identical analysis when labels are the same (factor vs character with same values).
  • Actual: results differ. Suspected cause: the character → factor conversion inside toSF() creates alphabetically ordered levels, which changes the processing order when generating the shuffled list.
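
To illustrate the suspected cause in isolation, here is a minimal base R sketch. It does not call crawdad; the labels are hypothetical stand-ins for seurat_obj$first_type. It only shows that converting a character vector back to a factor yields sorted levels, while an existing factor keeps its own level order.

```r
# Hypothetical labels with a custom (non-alphabetical) level order,
# standing in for seurat_obj$first_type
labels_f <- factor(c("Tcell", "Bcell", "Myeloid", "Bcell"),
                   levels = c("Tcell", "Myeloid", "Bcell"))

# Same labels as a plain character vector
labels_c <- as.character(labels_f)

# factor() on a character vector sorts the unique values to build levels
refactored <- factor(labels_c)

levels(labels_f)    # custom order kept
levels(refactored)  # sorted order

# Per-cell labels are identical, but level order and integer codes differ
same_values <- identical(as.character(labels_f), as.character(refactored))
same_levels <- identical(levels(labels_f), levels(refactored))
same_codes  <- identical(as.integer(labels_f), as.integer(refactored))
```

Here same_values is TRUE while same_levels and same_codes are FALSE, so any code that iterates over levels() would visit the cell types in a different order for the two inputs. Whether that is what happens inside toSF() or makeShuffledCells() is my assumption, not something I have confirmed in the source.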

Reproduction

# factor input (preserve Seurat factor levels)
slide_f <- crawdad:::toSF(pos = coord,
                          cellTypes = seurat_obj$first_type)

# character input (explicitly converted to character first)
slide_c <- crawdad:::toSF(pos = coord,
                          cellTypes = as.character(seurat_obj$first_type))

scales <- c(50, seq(100, 1000, by = 100))
ncores <- 10
seed <- 1
perms <- 3

shuffle_f <- crawdad:::makeShuffledCells(slide_f, scales = scales, perms = perms, ncores = ncores, seed = seed)

shuffle_c <- crawdad:::makeShuffledCells(slide_c, scales = scales, perms = perms, ncores = ncores, seed = seed)

res_f <- crawdad::findTrends(slide_f, neighDist = 80, shuffleList = shuffle_f, ncores = ncores, verbose = FALSE, returnMeans = FALSE)
res_c <- crawdad::findTrends(slide_c, neighDist = 80, shuffleList = shuffle_c, ncores = ncores, verbose = FALSE, returnMeans = FALSE)

I also see similar issues when I randomly re-level the cell type variable supplied to toSF(). I can share my data if needed. Please let me know your thoughts.
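
For reference, this is roughly how the random re-leveling can be reproduced, along with a possible workaround that forces one canonical level order before calling toSF(). It is a base R sketch only: canonical_levels() is a hypothetical helper, not part of crawdad, and I have not verified that it removes the discrepancy in the downstream results.

```r
set.seed(42)

# Hypothetical cell type labels
cell_types <- factor(c("Tcell", "Bcell", "Myeloid", "Bcell", "Tcell"))

# Randomly permute the level order; the label of each cell is unchanged
releveled <- factor(cell_types, levels = sample(levels(cell_types)))

# Hypothetical helper: rebuild the factor with sorted levels so that
# factor, re-leveled factor, and character inputs all end up identical
canonical_levels <- function(x) {
  x_chr <- as.character(x)
  factor(x_chr, levels = sort(unique(x_chr)))
}

from_factor    <- canonical_levels(cell_types)
from_releveled <- canonical_levels(releveled)
from_character <- canonical_levels(as.character(cell_types))

all_identical <- identical(from_factor, from_releveled) &&
  identical(from_factor, from_character)
```

Passing canonical_levels(seurat_obj$first_type) as cellTypes would at least make the three input forms indistinguishable, which may help narrow down whether level order is really what drives the differences.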
